Preamble


Writing an introduction to a ‘new’ book on rhotics appears quite an awkward 
task, especially if one harbours hopes to present new data and to envisage 
perspectives on the topic, as the subtitle to the volume suggests. Is there really 
anything new about rhotics? 

Even from a quick overview of the contributions collected, the answer is 
definitely positive. Although phoneticians, above all, have made great progress 
in understanding the articulatory, acoustic and perceptual characteristics of 
rhotics and their exceptional variation (Recasens & Espinosa 2007; Engstrand 
et al. 2007; Proctor 2009; Lawson et al. 2011), the /r/ family still remains an
anomalous case as a class of sounds for many well-known reasons:


a) The puzzling nature of their phonological representation (Wiese 2011);

b) The unusually wide range of variants (not infrequently within the very
same phonological system);

c) The tendency to take on flexible social meanings.


The papers collected in this book thus clearly represent a step further towards a
better understanding of rhotics in at least two ways: firstly, new data are collected 
on /r/ in many non-European languages, some of them coming from poorly (or 
not at all) described languages; secondly, different disciplinary standpoints are 
taken up in order to capture the kaleidoscopic /r/ phenomenology. 

The primary goal of having descriptions of many languages is to document how
/r/ is articulated and varies within distinct phonological systems. A twofold 
secondary aim is (a) to establish an empirical base for cross-linguistic and 
typological comparisons (b) which in turn could be used as a benchmark to take 
stock of theories or generalizations of human spoken communication (language 
sound systems). As a consequence, this book brings together articles that 
examine various aspects of rhotics in fifteen languages (or language varieties),
namely: 


- Saraiki (Indo-Aryan language spoken in South Punjab; in Syed);

- Malayalam (Dravidian language spoken in southern India; in Scobbie et al.); 

- Washili Shingazidja (Bantu language; in Patin);

- Modern Hebrew (Cohen); 

- Greek (Baltazani & Nicolaidis); 

- British English (Syed) and American English (Rieira & Romero); 

- Dutch (Van de Velde et al.); 

- German (Hoole et al.) and Tyrolean (South Bavarian German dialect, 
Spreafico & Vietti); 

- Slovak (Hoole et al.); 

- Romanian (Savu); 

- Canadian French (van 't Veer; Sankoff & Blondeau) and French (Hoole et al.);

- Italian (Spreafico & Vietti; Romano). 


On the other hand, /r/ and related phenomena are captured under different 
theoretical and methodological perspectives, following the tradition of previous 
r-atics workshops. Mechanisms and strategies of first (Syed) and second (van 't Veer) 
language acquisition, ultrasound-based comparison in bilinguals (Spreafico & Vietti), 
acoustic (Savu) and kinematic analysis of articulation of /r/ (Scobbie et al.; Hoole 
et al.), phonological interpretation of allophonic variation (Patin) or phonological 
processes (Cohen), socio-geographical representation of language variation under 
a diachronic angle (Van de Velde et al.; Sankoff & Blondeau; Romano), all taken
together depict an enlightening and multifaceted image of r-sounds. 

In the next section, the contributions are grouped according to the main 
perspective or scientific framework. The most insightful general questions 
emerging from the analysis are also reported and emphasized, in order to 
illustrate the range of transversal issues connecting papers to each other as well 
as connecting them all to less superficial issues related to the interaction between 
phonetics and phonology. 


2. Language acquisition and bilingualism


The three contributions that fall within the broad framework of (first and
second) language acquisition and bilingualism are from van 't Veer, Syed and
Spreafico & Vietti.

The first paper by van 't Veer explores the hypothesis of /r/ being featurally
underspecified or not specified at all for place of articulation. The author refers
to data from a study published by Rose (2000, 2003; data are available on
the CHILDES phonetic database) and contrasts them with typological and
diachronic evidence in the literature. He reports on two different patterns of
L1 acquisition by two children. The first seems to categorize French /r/ more in
terms of intrinsic phonetic properties (namely as a uvular fricative), partly
discarding the phonotactic distribution of the phone. The second child picked
up phonotactic information more as adult speakers do, thus classifying /r/ as a 
rhotic, and consequently not specifying it for PoA. The author adds to Rose's 
analysis an explorative acoustic examination on a very limited set of tokens, 
aiming to compare the two speakers' productions and to search for differences in 
the acoustic output. The results point towards a similar production of /r/ in both 
children, therefore opening again a number of questions on the nature of dorsal 
/R/ phonological representation. What information is more easily recoverable 
from the input in ambivalent phonemes, distributional or segmental? Could this 
case support, as the author suggests, a view of phonology as substance-free in 
which abstract representations are partly detached from acoustic information? 

Syed investigates the patterns of acquisition of English [ɹ] by Pakistani learners.
The perceived phonetic distance is used to measure the similarity of English
[ɹ] to the neighbouring sounds in the English inventory as well as in the Saraiki
consonant system. The Speech Learning Model's principle of equivalence
classification (Flege 1995) is tested on a sample of 90 learners of English with
varied competence and exposure to the L2. In accordance with the perceived
distance between phones, a developmental pattern emerges from the analysis:
English [ɹ] is acquired by learning to discriminate it from L2 [l] in the first
place, then from L2 [w] and finally from Saraiki [r].

In their contribution, Spreafico & Vietti explore the articulatory properties of /r/ 
in simultaneous and sequential Tyrolean-Italian bilinguals. Using the ultrasound 
imaging technique, they examine whether adult bilinguals display different 
tongue shapes for rhotics in each language they speak and whether bilinguals’ 
articulatory patterns in each language are similar to those used by almost 
monolingual speakers or not. The results show that very late sequential bilinguals 
(for the sake of simplicity read here ‘almost monolinguals’) do not present distinct 
lingual shapes for rhotics in the two languages, while the simultaneous bilinguals 
do. Moreover, inter-speaker comparison indicates that articulatory patterns for 
rhotics used by simultaneous bilinguals differ from those used by the very late 
sequential bilingual speakers who are used as control subjects. To sum up, late 
sequential speakers transfer their /r/ from L1 to L2, whereas simultaneous 
bilinguals distinguish rhotics in the two languages, even if their rhotics are
articulatorily different from those of the late sequential speakers. The study
points to further directions of investigation: the articulatory means of
phonological contrast within and between languages in bilinguals, the complex 
intertwining between articulation, acoustics and perception, and finally the role 
of sociophonetic factors in /r/ variation in simultaneous bilinguals. 


3. Phonetics and phonology 


Studies in the field of experimental phonetics play a major role in the structure 
of the book: three of them (Hoole et al.; Scobbie et al.; Baltazani & Nicolaidis)
present innovative and insightful evidence on the articulation patterns of rhotics 
in German, French, Slovak, Malayalam and Greek using UTI, EMA and EPG 
data. The following two contributions (Savu; Rieira & Romero) provide an 
acoustic analysis of the effects of coarticulation on the structure of /r/ in Romanian
and American English. The last two papers in this section are more
phonologically oriented: one proposes a CVCV phonology interpretation of /r/
allophonic variation in Washili Shingazidja, the other is an OT account of some
idiosyncratic phonological processes in the loanword phonology of Hebrew.

Hoole et al.'s paper focuses on the kinematic properties of rhotics as a special
case of gestural coordination of consonant with consonant and consonant with
vowel. Two sets of EMA data are presented. The first study explores
the characteristics of /kr/ clusters in German and French as compared to other
obstruent-sonorant clusters, namely /kl/ and /kn/ clusters. The low overlap in
plosive-rhotic clusters is discussed as a potential source of diachronic instability
which could in turn be conducive to metathesis. Articulatory synthesis is also used
to further explain the reason for the low overlap.

The second study provides an analysis of syllabic liquids /l/ and /r/ in Slovak. To 
begin with, the kinematic properties of the liquids are examined, as a function 
of the position in the syllable, then an analysis of the articulatory coordination 
patterns is carried out. The most notable results emerging from the data
are the following:


a) There is no consistent difference between liquids in onset, coda or 
nucleus: kinematically speaking they are still consonants; 

b) Liquids present a lower overlap with the preceding C when they are in 
nucleus position than in onset; 

c) Nucleus /r/ shows less overlap than vowel or /l/ as syllable nuclei. 


The authors discuss the phonological implications of the results by
hypothesizing that syllabic consonants are typologically infrequent because they 
require a coordination pattern which is different from the default CV pattern. 
Consequently, in Slovak it is possible to have syllabic liquids because in absolute 
terms consonant-consonant coordination shows a low overlap, thus suspending
“the basic principle of a continuous vocalic substrate with overlaid consonant
constrictions”. The general aim of the research is to study the emergence and
development of sound patterns as a function of the patterns of articulatory 
coordination. 

The contribution of Scobbie et al. is a high-speed ultrasound imaging
investigation of the phonemic system of liquids in Malayalam, a Dravidian
language spoken in southern India. Malayalam represents an interesting case
study for many reasons: on the one hand there is a complex system of contrasts
in the liquids based on both a primary and a secondary articulation (clear-dark
resonances), on the other hand it works as a ‘natural laboratory’ to assess the
potential and limits of the UTI technique for detecting basic lingual properties
of phonological distinctiveness. In accordance with previous acoustic studies 
on Malayalam and instrumental articulatory research on Tamil and Kannada, 
they carefully document the system of contrasts in general and the ambivalent 
properties of the fifth liquid in particular. Exploring the static and dynamic 
characteristics of the five liquid phonemes, the authors raise a valuable range of 
questions and conjectures for future research. Among these, the following issues 
deserve to be mentioned: 


- The multifarious role of tongue root (a) in the resistance to coarticulation, (b) 
in the production of trills, (c) as an articulatory correlate of dark resonances; 

- The unreported dynamic properties of the fifth liquid (post-alveolar
approximant with frication), which the authors call a zig-zag movement;

- The need to take into account the phonological ambivalence of certain
segments as part of the phonological competence (and not as an aberrant case).


Baltazani & Nicolaidis present an acoustic and articulatory (EPG) analysis of the 
Greek tap, which appears to be the dominant allophone of /r/ in many prosodic
contexts (specifically in /Cr/ and /rC/ clusters, between vowels, but also as a
singleton phrase- and word-initially). The presence of a vocalic element, together
with the ballistic contact gesture, is interpreted here as an essential part of the 
sound structure of the rhotic (as in Savu's contribution), rather than as an effect 
of the gestural overlap between two consonants in CrV contexts (as suggested in 
Hoole et al.). Following this last line of reasoning, the effect of the overlapping 
might be the emergence of a vocalic nucleus between the two consonants, but
the authors provide evidence against this account, at least in Greek, observing the
occurrence of a vocoid in absolute initial position (#rV), where there is no other
consonant to overlap with. The acoustic measurements show that the vocalic
elements are longer than the constriction phase and that their vocalic quality reflects
the formant values of their corresponding nuclear vowels, only more centralized.
Integrating the acoustic investigation, the EPG data provide evidence for a 
classification of taps into two categories with a complete or an incomplete 
closure. This distinction could suggest a view of taps as steps in a continuum from 
prototypical (fortis) taps to lenis taps to more vocalic realizations, as in a ladder 
towards a potential language change from (trills to) taps to approximants. 

In a similar way, Savu explores the phonetic structure of taps in Romanian. The 
author puts forward the hypothesis that the phase of constriction is surrounded 
by two vocalic elements (not one as in Baltazani & Nicolaidis), which she 
considers components of a tap and not as intrusive or epenthetic vowels. Thus, 
the structure of a tap is made up of a sequence like vocoid+constriction+vocoid,
more evident in #rV, Cr and rC contexts. The primary aim of the study is to 
measure formant structure and duration of the vocalic elements in order to 
establish the range of variation. In addition, a secondary and original goal is to 
investigate a possible resemblance between the vocoids in Cr and rC contexts 
and those in VrV context. The preliminary results show that vocalic elements 
bordering the tap closure tend to approach the quality of the syllabic vowels, 
even if still positioning themselves in a mid-high central to front area. To
further support the tap structure proposed in the paper, more
evidence is needed: (a) quantifying the coarticulation effects in VrV
sequences, and (b) observing the behaviour of taps in contexts where there are no
vowels on either side (like #rC, CrC and Cr#) and they function as syllabic
nuclei, as in languages like Czech or Serbo-Croatian.

The role of a transitional vocalic element in Vr sequences is discussed within a
different framework by Rieira & Romero. Mutatis mutandis, the aim is
again to show that the vocoid should not be considered a vocalic epenthesis,
and consequently the result of a phonological process, but rather an
unstable targetless transitional element affected by coarticulatory forces.

The study contains an acoustic analysis of Vr contexts in American English
in slow and fast speech. First, segmentation procedures, based
on the identification of inflection points in the formant curves, are used to divide the
sequence into three components: the vowel, the transitional element and the rhotic
consonant. Next, durational and formant structure information is measured for
the three components, focusing on stressed monosyllables. Finally, an
ANOVA is carried out to test the hypothesis that the
three components vary with speech rate and vowel context. The effect of
coarticulation is confirmed by (a) the variation of the schwa-like element as a
function of the context and (b) the influence exerted by speech rate (even
if data in the latter case are reported for only one exemplar speaker).

The following two contributions aim at giving a phonological explanation of the
somewhat anomalous behaviour of rhotics in Washili Shingazidja and Modern
Hebrew.

The first study by Patin provides a detailed description of /r/ allophonic 
distribution in Washili Shingazidja, a Bantu language spoken on Grande 
Comore (one of the five Comorian islands). The data are collected from a single 
speaker. In the basic allophonic pattern a trill [r] alternates with a tap [ɾ]: the trill
appears in initial position (and, apparently, mainly in Arabic loanwords) and
the tap in intervocalic position (also across a word boundary). The distributional
scheme becomes complicated by the presence of a preceding consonant, which
triggers a trill, or a syllable with no high tones, which favors an approximant.
The overall allophonic variation is accounted for within the CVCV phonology
framework. Basically, the author suggests that a trill in absolute initial position 
corresponds to an underlying geminate and offers three arguments in support 
of his hypothesis: 


a) The geminate is the result of a process of assimilation of the determinant in
Arabic loanwords (e.g. a[r]uh < (a)l-ruh ‘the soul’);

b) The initial trill cannot be considered the consequence of fortition (which is
normally associated with a voiceless retroflex);

c) The geminate is likely to occur in casual speech as a result of a vowel deletion 
(e.g. [r]i[e]i ‘we played / we feared’ > [r]i mpi[1]á ‘we played a game’). 


However, the author admits that the preliminary CVCV phonology
interpretation fails to account for the whole distribution pattern, for instance the tap
realization before [i] or the occurrence of the trill between a consonant and [i].
In addition, a wider sample of speakers is probably needed to gain a clearer idea
of the relative weight of loanwords in the phonological process.

The second study by Cohen begins with the observation that phenomena not
supported by the native Hebrew grammar seem to occur when /r/ is involved 
in loanwords from English into Hebrew. In particular, two phonological 
processes, reduplication (which is morphologically productive) and metathesis 
(not systematic) are likely to interact with the presence of /r/. In the process of 
adaptation of /r/ in loanwords, exceptional (that is, not part of Hebrew phonology)
prosodic phenomena appear: on the one hand /r/ is metathesised from coda
to onset (e.g. kornfleks > kronfleks), on the other hand a pseudo-reduplication
process moves /r/ from onset to coda to create pseudo-reduplicative patterns, as
in proportsja > porportsja.

The author proposes an account within Optimality Theory to explain
what looks like the emergence of universal grammar constraints. He assumes
the stratified lexicon hypothesis, according to which the lexicon is divided into a
core and a periphery with partially different phonologies (Paradis & LaCharité
1997). Therefore, constraints which are relevant to explain loanword adaptation
may not be applicable to native-word phonology.

To explain metathesis, *Coda-r (a sub-specification of *Coda) is proposed.
This constraint outranks Max, LINEARITY (native/loanwords) and *Cx (no
complex syllable margins) and moves /r/ from coda to onset. The OT
explanation is not totally satisfactory when pseudo-reduplication
comes into question. In that case, the same set of constraints plus Redup does
not produce the correct output: for proportsja the predicted output is *proprotsjia
rather than the actual winner, porportsjia (with *Coda-r violated). Even
if arguable, the contribution raises a significant question: is *Coda-r really a
universal constraint? What kind of typological evidence do we have? It cannot
be our ambition to answer these questions here, but a remarkable connection 
could be traced to the paper by Hoole et al., in which kinematic evidence for 
metathesis is reported as a consequence of low overlap in CrV sequences. 
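
The ranking argument for metathesis can be made concrete with a small evaluation sketch. This is only an illustration under stated assumptions: the text says that *Coda-r outranks Max, Linearity and *Cx, but the relative order of those three constraints, the candidate set and the violation counts below are all hypothetical and not taken from Cohen's paper.

```python
# Constraint ranking, highest-ranked first. Only "*Coda-r outranks the rest"
# comes from the text; the order among Max, Linearity and *Cx is assumed.
RANKING = ["*Coda-r", "Max", "Linearity", "*Cx"]

# Hand-assigned, illustrative violation counts for the input /kornfleks/.
CANDIDATES = {
    "kornfleks": {"*Coda-r": 1, "*Cx": 3},    # faithful: /r/ stays in the coda
    "kronfleks": {"Linearity": 1, "*Cx": 3},  # metathesis: /r/ moves to the onset
    "konfleks": {"Max": 1, "*Cx": 2},         # deletion of /r/
}


def violation_profile(violations, ranking):
    """Return violation counts ordered by rank, for lexicographic comparison."""
    return tuple(violations.get(constraint, 0) for constraint in ranking)


def evaluate(candidates, ranking):
    """Pick the candidate with the smallest profile under strict domination."""
    return min(candidates, key=lambda c: violation_profile(candidates[c], ranking))


winner = evaluate(CANDIDATES, RANKING)
print(winner)
```

With these assumed profiles the metathesised form wins; demoting *Coda-r below the faithfulness constraints makes the faithful candidate win instead, which is the sense in which the constraint is doing the work in the loanword stratum.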


4. Language variation and change 


The papers contained in the last section deal with the social and geographical 
variation of /r/ in three different areas: Romano presents data on the variability of 
rhotics in Italy; Van de Velde et al. analyse geographical variation in a diachronic 
perspective on the Dutch dialect in Flanders; likewise Sankoff & Blondeau report 
on a sound change in progress in Montreal French. It should be noted, however, 
that Romano's and Sankoff & Blondeau's papers are (up to now) unpublished
studies from the r-atics-2 conference, thus dating back ten years. As they
still represent valuable contributions and missing pieces of evidence in the debate
on rhotics, the two articles find their natural place within the structure of the book.

Romano's study is an accurate description of the allophonic distribution of /r/
in Standard Italian, as well as a detailed illustration of the socio-geographical
variability of rhotics all over the Italian peninsula. The basic standard patterning
is defined as an alternation of a trill and a tap, with the tap occurring only in
intervocalic unstressed syllables (e.g. raro [raro] < /raro/ ‘rare’), and the trill in
the remaining contexts. Next, a wide range of coronal and dorsal variants are 
identified and classified as geographical (thus belonging to a geographical variety 
of Italian), social, idiolectal or pathological. As underlined by the author, further 
research on the articulation of /r/ in Italian as well as on the sociolinguistic 
meanings attached to rhotics is still needed (ten years ago, as today). 

The last two papers examine the problematic sound change from apical to 
uvular /r/. 

Van de Velde, Tops & van Hout discuss the socio-geographical spreading of 
uvular /r/ in Flemish Dutch over a span of almost ninety years (from 1922 
to 2009). The authors analyse three sets of data, two coming from traditional 
dialectal surveys and one collected with a more sociolinguistic approach. The 
combination of geographical and social methods proves to be an excellent 
instrument to capture the dynamics of a sound change. In our opinion,
implementing geographically-based rapid and anonymous surveys could represent
a new perspective for a multidimensional documentation of language variation
and change in Europe. This could be especially true if we aim to re-draw a
map of the spreading of uvular /r/ across Western Europe. Coming back to the
contribution, the results show an ongoing change from apical to uvular /r/ in
Flanders, in particular among the younger generations. Interestingly enough,
it must be remarked that in a context of considerable social-geographical 
variability (e.g. twelve variants are registered in the RAS study), individual 
speakers are not likely to alternate front and back rhotics. 

A similar finding is described in Sankoff & Blondeau’s paper, reporting a
sound change in progress in Montreal French from apical to uvular /r/. These
strands of independent evidence (supported also by the study of Vietti &
Spreafico 2008) seem to lead to the conclusion that some sound changes, at
least at the individual grammar level, ought to be categorical, while others,
like vowel shifts for instance (see recently Harrington 2006), have to be
incremental in their nature. In their paper, Sankoff & Blondeau analyse in
particular the sociolinguistic behaviour of two speakers that show a pattern
of [r]-[R] variation (contained in the interval between 20 and 80%), in order
to understand phonological and stylistic factors of conditioning. The process 
of change from a variable to a categorical use of [R] passes through a phase of 
prosodic conditioning that favours the occurrence of uvular /r/ in syllable coda. 
On the other hand, it still remains unclear which stylistic factors affect
speakers’ choice between apical and uvular /r/.


5. On the vital importance of being variable


As this introduction illustrates, the book's aim was not to unravel the complex 
question of the phonological unity of /r/, but rather to purposefully pursue 
the empiricist idea of offering data-based descriptions of /r/ in many different 
languages and, possibly, from different theoretical angles. Therefore, the
collection of papers taken as a whole reflects the understanding that in order
to explain the variability of /r/ a broad cross-linguistic framework is needed (as
proposed, for example, in Lindau 1985). In addition, most of the papers share
another basic feature of the empirical view, namely the experimental context of 
data collection and the instrumental method of analysis. 

Thus, although the aim appears quite elementary, the combination
of experimental method and cross-linguistic perspective may nevertheless
lead to some important consequences for a phonology of rhotics. First, the
evidence coming from articulatory data (notably EMA and UTI) sheds
new light on the characteristics of the class of rhotics both in terms of static
configuration and dynamic behavior (see for instance the coordination patterns 
of rhotics in consonant clusters). Moreover, and especially regarding UTI, the 
rich representation of lingual shapes and movements implies a change in the 
received categories of the sounds' articulation, and consequently it fosters the 
reformulation of the current terminology. 

Second, the adoption of a cross-linguistic framework has several advantages.
In a very straightforward way, it allows us to:


a) Test hypotheses and/or descriptions based on the most scientifically
investigated languages (the varieties of English being the first); 

b) Compare /r/ segmental features and coordination patterns as well as 
allophonic distributional patterns; 

c) Verify whether the tendency of rhotics to take on sociolinguistic meanings 
is cross-linguistically consistent. 


Taken together, the three points address the general topic of the role of
within-system /r/ variability as a constant component in the sound systems of
the world's languages, thus showing the importance of such a variable class of 
sounds as a functional and vital element in a fully fledged phonological system. 
As this introduction attests, the book raises many issues. We hope these issues 
will be as much a source of inspiration to everybody working on rhotics as they 
have been to us. 


Abstract

In this paper, we discuss the acquisition of /ʁ/ for two children acquiring French, for one of
whom /ʁ/ triggers within-cluster assimilation of coronal obstruents. This is conspicuous,
as French has a placeless rhotic. Accordingly, the rhotic of the other child is the target of
place assimilation. Rose (2000, 2003) attributes the difference to the fact that the French
rhotic is phonetically fricative-like, whereas it behaves, phonotactically, like a liquid.
Hence, two possible sources of information for the acquiring child contradict each other.
We discuss cross-linguistic evidence for and against place-bearing rhotics, concluding
that both possibilities exist. To see to what degree the /ʁ/ is the same in the two children,
we present an acoustic study, after which we demonstrate a reconstruction of the possible
path of acquisition of Théo. Finally, we discuss the relevance of phonetic measurement
for phonological patterns.


1. Introduction 


In deciding which features to use when storing words and their segments, 
children must reconcile multiple sources of evidence. For one thing, phonetic 
similarity and distributional properties play an important role (Maye & 
Gerken 2000; Maye, Werker & Gerken 2002; Maye & Weiss 2003). On the
other hand, we know that children are sensitive to the phonotactic patterns 
of their surrounding language from the age of nine months (Saffran & 
Thiessen 2003). In some instances, these two sources provide contradictory 
cues. 

One such case is French /ʁ/. Phonetically a fricative, or at least very
fricative-like (see, for example, Rose 2000:8), phonotactically it patterns with the other
liquid in the language, /l/. Thus, learners of French must find some way to 
weigh these two conflicting sources of evidence in such a way as to arrive at an 
adult-like grammar. 

It is not surprising that children have difficulty with this. Rose (2000) describes
two learners of French, Clara and Théo, who have differing acquisition patterns
when it comes to /ʁ/. What is especially striking is that Théo’s /Cʁ/ onset clusters
display a very robust pattern of cluster-internal dorsal assimilation, where the
/ʁ/ is the trigger and coronal obstruents are targets. This is a highly remarkable
pattern, because cross-linguistically, there are very few cases where rhotics are
specified for place of articulation, let alone where they trigger assimilation. In
his analysis of these data, Rose (2000: chapter 5) proposes that the differences
between the development of /ʁ/ observed in the two children stem from
different underlying representations: Clara's rhotic is placeless, whereas Théo
has posited an underlying feature [dorsal] for his /ʁ/¹. Rose (2000) attributes
this difference to the phonetics of French /ʁ/, namely that it is a uvular across
the board (Rose 2000:244-5, 261), and uvular consonants can be analyzed as 
[dorsal] (Rice 2011). This idea is further expanded upon in Rose (2003), where 
the author points to the fact that in adult (Québec) French, the rhotic often 
surfaces as a uvular fricative in branching onsets (where the head is a voiceless 
obstruent). 

Based on the data observed and the analyses proposed in Rose (2000, 2003), it 
would appear a viable option that Clara's initial hypothesis is that /ʁ/ is a liquid,
whereas Théo’s initial hypothesis might be that it is an obstruent. This would
imply that Clara places more emphasis on the phonotactic evidence, and Théo
more on the phonetic evidence (see also Rose 2003:428). In this paper, we will
attempt to see if we can find evidence for different representations in the acoustic
signature of the rhotics of both children. We will follow Rose's hypothesis that
the phonetics of French /ʁ/ contradict its phonotactic distribution, and that
this is the reason for the difference. Whereas Rose (2000, 2003) focuses mainly
on the phonetics of Place of Articulation, we will also consider the manner
specification of /ʁ/ in the respective grammars of both children.

In the next section, we will briefly go over some typological data to see whether 
we can find cross-linguistic evidence for either placeless or place-bearing 
rhotics. In section 3, we will show the acquisition pattern of the two children's 
rhotics in more detail. Section 4 presents a tentative acoustic study, and section 
5 concludes. 


¹ It should be noted that there are more differences between the two children regarding their /ʁ/; reasons of
space prevent us from going into much more detail, but see section 3 below, and Rose (2000) for a full
description.


2. A typology of rhotics and Place of Articulation


Liquids are among the most elusive and difficult-to-understand of all phonemes. 
With regard to rhotics, phonological discussion focuses mainly on the feature 
specification. One area of disagreement is whether rhotics have a Place of 
Articulation (henceforth: PoA) specification. The most radical proponent of 
rhotic PoA is Walsh Dickey (1997), who goes so far as to say that rhotics are 
universally defined by a specific PoA specification (that is, all rhotics have a 
secondary laminal node). The other position has less radical proponents, but 
deserves consideration nonetheless. In the following sections, we will review 
some evidence for and against both positions. If not indicated otherwise, the 
examples are from synchronic phonology. 


2.1 Placeless rhotics 

Despite Walsh Dickey’s proposal, rhotic placelessness appears to be the default 
position in the literature. In this section, we will review some of the reasons why 
this is so. 

A general indicator of the presence of a feature in an underlying representation 
is that the segment it belongs to displays some phonological behavior (the 
‘natural class’ argument). Hence, we will look at the phonological behavior of 
rhotics. In general, rhotics do not trigger any alternations involving PoA (but 
see section 2.2 for some counterexamples). In addition, they often escape rules 
that otherwise target PoA, such as coda place assimilation: 


(1) Place assimilation in Italian 
    a. Diachronic obstruent-obstruent assimilation: cognates 

       English    Italian    < from Latin 
       fact       fatto      < factum 
       abdomen    addome     < abdomen 

    b. Nasal assimilation: morpheme boundaries 

       in+portante > importante 

    c. Liquids in codas 

       Italian    English 
       aperto     open 
       arco       arch 
       salto      jump 
       alcol      alcohol 

As 1c illustrates, both laterals and rhotics escape the coda condition on place. 
Something similar arises if we consider the distributional properties of the 
word-final consonant in Italian; in general, Italian has no word-final codas. 
However, in some function words, such as per ‘for, to, through’, consonants 
appear word-finally. Most of these cases involve liquids. We can attribute both 
patterns to a general ban on independent PoA in codas; the ban does not apply 
to rhotics because they are placeless. 

Another indicator for placelessness is found in Backley (2011), who notes that 
the rhotic in (British) English stands in a similar relation to /a/ as /i/ to /j/ 
and /u/ to /w/, in that they group together in glide/liquid alternations. Backley 
proposes that the rhotic in English is a glide, and its vocalic counterpart is /a/. 
This low vowel is usually thought to be underspecified for place (Hall 2011), 
and by analogy, the same would hold for the rhotic. 

In a study of onset cluster phonotactics in Germanic languages, Goad & Rose 
(2004) note an asymmetry in the distribution of clusters where the obstruent 
is coronal. Consider the onset cluster inventory of Dutch, for example, in 
Table 1. Laterals can cluster with coronal fricatives, but not stops, and rhotics 
can cluster with coronal stops, but not fricatives. The analysis that Goad & 
Rose (2004) propose rests on two points. First, laterals are [coronal], whereas rhotics 
are placeless. Second, there is a difference between ‘real’ onset clusters and 
‘apparent’ onset clusters: whereas most obstruents in onset clusters are actually 
in the onset constituent, this does not apply to /s/, which is syllabified as an 
appendix². The absence of /tl/ in this inventory, Goad & Rose (2004) argue, 
is due to a restriction on identical places of articulation in a cluster (/po/ onset 
clusters are also banned in Dutch, as are /kx/ clusters; the same constraint 
holds in German and English). If so, the reason /tr/ is licit must be that the 
rhotic is not coronal. The ungrammaticality of /sr/ clusters is due to the fact 
that an appendix must be licensed, and can only be licensed by a ‘strong’ onset. 
The rhotic is not strong enough, because it lacks a place specification. 


stop + liquid         /pr/    /tr/     /kr/ 
                      /pl/    */tl/    /kl/ 
fricative + liquid    /fr/    */sr/    /χr/ 
                      /fl/    /sl/     /χl/ 


Table 1 - Onset cluster inventory of Dutch. 

2 The strange behavior of sC clusters is one of the most famous problems in phonology. See Goad (2011) for 
an overview. 

A similar situation holds for English (which also bans /θl-/) and German. The 
German case is slightly different, as the language allows for /ʃr/ onset clusters; 
see Goad & Rose (2004: section 4.3) for details, which however do not change 
the analysis of the rhotic as placeless. 

Rose & Demuth (2006) show that in English and Afrikaans loanword 
adaptation in Sesotho, epenthetic vowels breaking up illegal onset clusters 
obtain their place of articulation from their leftward environment. In word-initial 
context, the left environment is constituted by the first member of the 
cluster, in other words, a consonant. Word-medially, where there is a vowel 
to the left of the cluster, it is the vowel that supplies the feature. A number of 
exceptions to this pattern exist, however. First, dorsal consonants do not supply 
a place of articulation to the vowel. Secondly, the vowel /a/ is only copied if no 
other source is available (Rose & Demuth 2006: section 3.3). What is most 
important for our present purposes is that neither /l/ nor /r/ ever supplies a place 
of articulation feature to an epenthetic vowel. In (2) there are some examples³. 


(2) Epenthetic vowels in Sesotho loanword adaptation 
    a. Word-initial clusters: left-to-right from consonant 
       Lab+Liq   blik       [blik]       [boleke]      ‘tin can/dish’ 
       Cor+Liq   troon      [truwn]      [tironr]      ‘throne’ 

    b. Word-initial dorsal-initial clusters: right-to-left from vowel 
       Dor+Liq   kroon      [kruwn]      [koroni]      ‘crown’ 
                 krip       [krip]       [kirrpi]      ‘crib/manger’ 

    c. Word-medial clusters: left-to-right from vowel 
       Lab+C     hops       [hops]       [hopose]      ‘drink made from hops’ 
       Cor+C     football   [futbol]     [futubol]     ‘football’ 
       Dor+C     box        [boks]       [bokose]      ‘box/case’ 

    d. Word-medial cluster preceded by /a/: left-to-right from consonant 
       Lab+C     sambreel   [sambne:l]   [samporelr]   ‘umbrella’ 
       Cor+C     address    [adres]      [aturese]     ‘address’ 

    e. Word-medial Liq+C clusters, preceded by /a/ followed by non-a-vowels: 
       right-to-left from vowel 
       Liq+Cor   kartjie    [kantji]     [kariki]      ‘cart’ 


3 The examples are all taken from Rose & Demuth (2006). I have adopted their transcriptions. 

It would be going too far at this point to reproduce the entire analysis developed 
in Rose & Demuth (2006). What is important, however, is that the rhotic in 
Sesotho (and the lateral) behaves as a placeless segment in loanword adaptation, 
as we can see in example (2e). 

A final example of an argument comes from van Oostendorp (2001). Discussing 
two dialects of Dutch, van Oostendorp (2001) argues that /r/ is not only placeless, 
but actually featurally empty. In the Brabantic dialect of Tilburg, /r/ patterns with 
fricatives word-finally, and with sonorants elsewhere. More directly related to place, 
in Maasbracht Dutch (Limburg), a contrast exists between falling and ‘dragging’ 
tone. Tone is realized on the main stressed vowel of the word, but minimal pairs 
exist only for rimes consisting of long vowels and vowels followed by sonorants: 


(3) Tonal minimal pairs in Maasbracht Dutch 
    a. Long vowels and sonorants 
       falling tone       dragging tone 
       bi:  ‘bee’         bi:  ‘at’ 
       min  ‘minus’       min  ‘vile’ 

    b. Obstruent-final rimes 
       falling tone       dragging tone 
       pt   ‘kernel’      - 
       zok  ‘sock’        - 


This contrast exists because sonorants can be moraic, whereas obstruents cannot, and 
falling tone is represented by a single high tone on the nucleus, whereas dragging tone 
consists of two high tone features. Rhotics display a dual behavior. Word-internally 
they pattern with sonorants, in that rhotic-final rimes can have falling as well as 
dragging tone, but word-finally, they behave as obstruents: no tonal contrast exists. 


(4) Dual tonal patterning of rhotics in Maasbracht Dutch 
    a. Word-internal rhotic-final rimes 
       falling tone         dragging tone 
       sperma  ‘sperm’      firma   ‘firm’ 
       'eryor  ‘worse’      'eryar  ‘annoy’ 

    b. Word-final rhotic-final rimes 
       falling tone         dragging tone 
       ker  ‘car’           - 
       ver  ‘far’           - 

In this respect, rhotics pattern exactly like /ŋ/, the placeless nasal (see van 
Oostendorp 2001 for references, and Rice 1996 for arguments for the 
placelessness of /ŋ/). 


2.2 Rhotics with PoA 

Our first example in which rhotics display evidence of a place of articulation 
feature comes from Selayarese (Mithun & Basri 1986). In morphological 
reduplication, word-final velar nasals assimilate to the adjacent onset. Consider 
the examples in (5)⁴: 


(5) Reduplication in Selayarese 

    pekaŋ    >  pekampekaŋ 
    soroŋ    >  soronsoroŋ 
    jaŋaŋ    >  jaŋaɲjaŋaŋ 
    keloŋ    >  keloŋkeloŋ 
    roŋgaŋ   >  roŋganroŋgaŋ 


The telling example here is the final one, in which the nasal surfaces as a coronal 
if followed by the rhotic. The pattern holds over word boundaries, as can be seen 
in the following examples involving the numeral annaŋ ‘six’. 


(6) annam poke 
    annan tau 
    annaɲ jaraŋ 
    annaŋ golo 
    annan rupa 


Thus, Selayarese exemplifies the possibility for rhotics not only to have a PoA, 
but an active one, too. 
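
The assimilation pattern in (5) and (6) can be stated as a small procedure. The sketch below is illustrative only and is not part of Mithun & Basri's analysis: the function names and the segment-to-nasal table are assumptions, the table covers only the onsets attested in (5) and (6), and the velar nasal is written ŋ throughout.

```python
# Hypothetical mapping from a following onset to the place-matched nasal.
# Only onsets attested in examples (5) and (6) are listed.
NASAL_FOR_ONSET = {
    "p": "m",            # labial
    "t": "n", "s": "n",  # coronal obstruents
    "r": "n",            # the rhotic patterns with the coronals
    "j": "ɲ",            # palatal
    "k": "ŋ", "g": "ŋ",  # velar (no visible change)
}


def assimilate_nasal(word, next_onset):
    """Replace a word-final velar nasal with the nasal matching next_onset."""
    if not word.endswith("ŋ"):
        return word
    # Unlisted onsets leave the velar nasal unchanged (an assumption).
    return word[:-1] + NASAL_FOR_ONSET.get(next_onset, "ŋ")


def reduplicate(base):
    """Full reduplication: the first copy's final nasal meets the base's onset."""
    return assimilate_nasal(base, base[0]) + base


for base in ["pekaŋ", "soroŋ", "jaŋaŋ", "keloŋ", "roŋgaŋ"]:
    print(base, ">", reduplicate(base))
```

The point of interest is the entry for "r": treating the rhotic as a coronal trigger is what yields roŋganroŋgaŋ and annan rupa.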

Selayarese is not alone in this respect; Chukchee (Lewis 2009) also has a pattern 
wherein a velar nasal assimilates to the following onset. Blevins (1994) gives the 
following examples: 


4 Apart from the role of the rhotics, this process is also interesting in connection to van Oostendorp's (2001) 
argument for the phonological placelessness of /ŋ/. See also Rice (1996) on the relation between coronals and 
velars, and why they are both often seen as ‘unmarked’ or ‘default’; one possible alternative analysis to the 
one proposed here is that the patterns in 5, 6 and 7 are the result of two interpretations of the same 
underlying, placeless consonant. However, this would leave unexplained the fact that the rhotic patterns 
with the other coronals. 

(7) Nasal assimilation in Chukchee 

    teŋ-əlʔ-ən      ‘good’ 
    tam-pera-k      ‘to look good’ 
    tan-tʃottʃot    ‘good pillow’ 
    tan-ləmŋəl      ‘good story’ 
    tan-rʔarqə      ‘good breastband’ 


Again, we see that the rhotic triggers the phonologically placeless nasal to 
surface as a coronal, whereas its default surface form (as can be seen in the first 
example, teŋ-əlʔ-ən) is velar. 

The examples from Selayarese and Chukchee involve a primary coronal place of 
articulation for the rhotic in these languages, but Sanskrit presents us with an 
example of a language in which the rhotic has an active secondary PoA feature, 
namely through the process of retroflexion (see 8): 


(8) n > ṇ / {ṣ, r} ... __ 


That is, the rhotic patterns with the retroflex fricative in triggering retroflexion 
on coronal nasals. Some examples are given in 9 (from Avery & Rice 1989): 


(9) Retroflex harmony in Sanskrit 
    a. pur-āṇa 
       iṣ-ṇa 
    b. kṣubh-āṇa 
       kṛpa-māṇa 
    c. marj-āna 
       kṣved-āna 


The examples in 9a and 9b show that vowels and consonants respectively are 
transparent to retroflex harmony (perhaps unsurprisingly). The examples in 9c, 
however, show that not all consonants are transparent: coronals block harmony. 
Hence, a straightforward hypothesis is that the triggers are coronals with a 
secondary feature [retroflex]. This entails that the rhotic in Sanskrit is a coronal. 

Finally, let us investigate an example where the PoA feature is not coronal, but 
something else (possibly [back]), and where the evidence is diachronic rather 
than synchronic. In Old English Breaking, front vowels underwent a diachronic 
process of diphthongization when followed by [back] consonants (Baker 2007; 
Barber 1997, among many others): 

(10) Old English Breaking description 

     [æ] or [æː]  >  [æa] or [æːa] 
     [e] or [eː]  >  [eo] or [eːo]    / __ [+consonant, +continuous, +back] +C 
     [i] or [iː]  >  [io] 


Example 11 lists a number of Proto-Germanic words, and their Old English 
counterparts. The pattern should by now be clear: the rhotic participates in a process 
that is best characterized by crucial reference to a PoA feature (as in 10 above). 


(11) Proto-Germanic and Old English cognates 

     Proto-Germanic    Old English    Contemporary English 
     ahta              eahta          eight 
     hertō             heorte         heart 
     hirdijaz          hiorde         herder 


2.3 Summary 


In this section, we have looked at a number of languages and a number of 
reasons why rhotics should be either placeless or place-bearing phonologically. 
In the end, evidence can be found for either position. However, placelessness 
seems to be the default, as cases in which rhotics are active in place-related 
phonological processes (either diachronic or synchronic) are rare (see also Rose 
2000, 2003). In the next section, we will examine the acquisition patterns of two 
learners of French, who show remarkably different patterns when it comes to the 
rhotic. The case at hand, in which the child exhibits evidence of a place-bearing 
rhotic when the surrounding language does not, raises the question of whether 
the child acquired the sound as a rhotic in the first place. This question is 
especially relevant with respect to French, with its fricative-like rhotic. 


3. The acquisition patterns of Clara and Théo 


In this paper, we investigate the phonetic contours of the rhotics of two learners 
of Québec French: Clara and Théo (Rose 2000)⁵, who, for all intents and purposes 
of the present study, are acquiring the same (Eastern) dialect of Québécois. The 
segmental inventory of Québécois French is, as far as consonants are concerned, 
identical to the segment inventory of European French (Rose 2000). The 
primary data consist of spontaneous speech, recorded roughly bi-weekly from 
age 1;00.27 to 2;08.19 for Clara, and from age 1;10.26 to age 4;00.00 for Théo. 
They were first published in Rose (2000), and the observations and examples 
given in this section rely heavily on that work. 

5 The data and software used in this study are freely available from PhonBank, see http://childes.psy.cmu.edu/phon/. 

Cross-linguistically, as we have seen, rhotics may either bear a place feature, or 
they may not, where the latter is the unmarked situation. In the Goad/Rose 
corpus of Québec French, we see this variation exemplified. Clara appears to 
represent her /ʁ/ as a placeless liquid from the beginning. She does not show any 
behavior that would indicate otherwise. On the other hand, Théo seems to go 
for the dorsal option. As shown by Rose (2000, 2003), this is evidenced by 
a pattern of dorsal assimilation in branching onsets where /ʁ/ combines with 
a coronal, throughout the entire period for which Théo was recorded. 


(12) Constituent-limited dorsal assimilation in Théo's onsets 
     a. Session 1998-11-26 record 149 
        orthography: Je suis trop fatigué encore 
        target:      [zo] [sui] [two] [fatige] [àkox] 
        actual:      [fy] [sy] [kgo] [fasige] [àk^ox] 

     b. Session 1998-11-26 record 47 
        orthography: faudrait qu'on ait une patte 
        target:      [fodxe] [kő] [ne] [vn] [pat] 
        actual:      [fokwe] [ko] [ne] [vn] ['paxtla] 

     c. Session 1998-11-26 record 30 
        orthography: y'avais en train de sauter 
        target:      [jave] [à] [tee] [d] [sote] 
        actual:      [jave] [ke«à] [kse] [d] [sote] 


As becomes clear from these examples, Théo’s rhotic has a dorsal place feature 
that triggers assimilation of coronal obstruents in the same onset constituent. 
There is, however, no phonological evidence (e.g. from spreading, blocking, 
or other phonological phenomena) for any place feature in rhotics in the 
surrounding language. Either his /ʁ/ is phonologically a rhotic with a [dorsal] 
feature, or it is a dorsal fricative with peculiar phonotactic properties. In the first 
case, Théo overspecifies his /ʁ/⁶; in the second case, he violates a phonotactic 
rule of French: there are no stop-fricative onset clusters⁷. 


6 But see Hale & Reiss (2003) for an argument why this could be expected. 

7 Save for some learned exceptions such as psaume ‘psalm’, psychologie ‘psychology’, which are very small in 
number. 

One clue comes from looking at the timeline of development of Théo's /ʁ/. 
The first instance of /ʁ/ per se occurs relatively late, but the segment surfaces 
target-like from its inception. It occurs in word-final position during the same 
period when other consonants do so. In contrast to Théo's rhotics, which trigger 
assimilation in clusters, Clara's rhotic undergoes place assimilation in singleton 
onsets in early sessions: 


(13) Non-adjacent place assimilation 


Word Target Actual 
carotte kaxot kage 
robe Kob wob 


Although superficially this is very similar to patterns of Consonant Harmony she 
displays, the timeline is not identical, and furthermore, in the case of /ʁ/, there 
is no directionality restriction (that is, /ʁ/ can receive its PoA from either the 
left or the right). This, Rose (2000) proposes, is because Clara's rhotic is devoid 
of any independent PoA, reflecting the cross-linguistically unmarked case. 
Clara's /ʁ/ behaves in all respects like a rhotic, whereas Théo’s represents the dual 
identity of the segment in the surrounding language. This raises the question 
of what underlying representation Théo has, other than the obvious [dorsal] 
feature, particularly in terms of Manner features. 


4. A tentative acoustic study 


The different acquisition patterns of Clara and Théo closely resemble the 
dual nature of the French rhotic: it is both liquid-like and fricative-like. It is 
especially interesting why Théo would posit a place-bearing rhotic, since there 
is no phonological evidence (e.g. from spreading) for this in his input, even 
though the acoustic evidence is potentially misleading and, as we have 
seen, place-bearing rhotics are cross-linguistically not ruled out. Thus, the 
question arises whether Théo is acquiring a rhotic in the phonological sense, 
or whether he is hypothesizing a fricative with highly marked phonotactic 
properties. Hence, we set out to investigate the acoustic characteristics of both 
children's rhotics. 


4.1 Items 

From both Clara and Théo, 40 tokens of faithfully produced prevocalic⁸ rhotics 
were selected, from both singleton and cluster onsets. Since the recordings are 
all of spontaneous speech, and made in a living room situation, not all tokens 
were suitable. Unsuitable tokens were those in which, during the period in 
which the rhotic was uttered, another voice was audible, background noise was 
present, someone present at the session apparently touched or breathed into the 
microphone, or where microphone hum was unacceptable. In order to avoid 
an uneven representation of rhotics produced in single words, the number of 
tokens from the same lexical item was limited to three. All tokens were studied 
in Praat (Boersma & Weenink 2012). 
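
The selection criteria just described can be sketched as a simple filter. This is a minimal illustration and not the procedure actually followed, which involved manual inspection of the recordings: the record layout (a `word` label and a `clean` flag for each candidate token) and the function name are hypothetical, and only the two limits (three tokens per lexical item, 40 tokens per child) come from the text.

```python
from collections import defaultdict

MAX_PER_ITEM = 3    # at most three tokens of the same lexical item (from the text)
TARGET_TOTAL = 40   # tokens selected per child (from the text)


def select_tokens(candidates, max_per_item=MAX_PER_ITEM, target_total=TARGET_TOTAL):
    """Pick suitable tokens in recording order, capping each lexical item.

    `candidates` is a list of dicts with the hypothetical keys
    "word" (lexical item) and "clean" (False if the token is unsuitable,
    e.g. because of overlapping voices or microphone noise).
    """
    per_item = defaultdict(int)
    selected = []
    for token in candidates:
        if not token["clean"]:
            continue                      # unsuitable recording conditions
        if per_item[token["word"]] >= max_per_item:
            continue                      # lexical item already represented 3 times
        per_item[token["word"]] += 1
        selected.append(token)
        if len(selected) == target_total:
            break
    return selected
```

Capping per lexical item before counting towards the total keeps a frequent word from dominating a child's sample.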


4.2 Criteria 

Six criteria were used, from general to more rhotic-specific. These are listed 
below, along with a brief description of how they were applied. Some of the 
measures involve the degree to which the segment is ‘sonorant-like’, mostly 
with respect to voicing (voice and harmonics-to-noise ratio, HNR). Measuring 
trillness, of course, is specific to rhotics. Two measures of PoA were also taken, 
as we might assume that a phonological specification of [dorsal] in Théo’s case 
might lead to a smaller standard deviation (because a phonological target is 
present). 

Length. The boundaries of each item were determined by exclusion; that is, the 
end of the section before the rhotic was determined, as was the start of the 
section after the rhotic. The remaining section was designated as ‘rhotic’. It was 
expected that this exclusive criterion would provide more objective results than 
any inclusive criterion. 

Voice. Whether a given token is voiced was determined on the basis of the 
presence or absence of voice bar throughout the duration of the rhotic. For the 
purposes of the present study, voicing is treated as a binary variable. 

Trillness. Even in a language like French, with its fricative-like rhotic, some 
tokens involve a pulse induced by the Bernoulli effect, stemming from the uvula 
hitting the tongue root. In the current study, each token was inspected both 
impressionistically and spectrographically to see whether such pulses are 
present. ‘Trillness’ is treated as a binary variable. 

HNR. The harmonics-to-noise ratio is a measure of the amount of energy in 
the signal that is present in harmonics relative to the amount of energy in the 
signal that is not; in other words, it measures the ‘fricativity’ of a given auditory 
segment. 
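
As an aid to reading the dB values reported below, the ratio can be written out explicitly. The sketch assumes the usual decibel convention for an energy ratio (ten times the base-10 logarithm); the function names are illustrative, and the code is not a description of how Praat estimates the harmonic and noise components from a waveform.

```python
import math


def hnr_db(harmonic_energy, noise_energy):
    """Harmonics-to-noise ratio in dB from two (already estimated) energies."""
    return 10.0 * math.log10(harmonic_energy / noise_energy)


def energy_ratio(hnr_in_db):
    """Inverse conversion: how many times more harmonic than noise energy."""
    return 10.0 ** (hnr_in_db / 10.0)


# Equal harmonic and noise energy gives 0 dB; lower values are more fricative-like.
print(hnr_db(1.0, 1.0))
```

On this convention, a mean of 6.76 dB corresponds to roughly 4.7 times as much harmonic as noise energy, and a mean of 3.36 dB to roughly 2.2 times.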


8 For the purposes of the present study, pre-glide rhotics were also included. 

F3. A characteristic of both apical and uvular trills is that they induce lowering 
of the third formant (Ladefoged & Maddieson 1996). As children's voices 
are different from adult voices, filters in Praat were adjusted to the following 
settings prior to performing measurements: the number of formants to look 
for was limited to three, and in the spectral filter the window length was set to 
0.0025. 

Center of Gravity. For those items for which no F3 could be measured, because 
there was not enough formant structure, the Center of Gravity (henceforth: 
COG) was measured instead. The COG takes into account the energy 
distribution of noise and determines where it is centered. Hence, it is a measure 
of relative backness and frontness, whereby a higher COG corresponds to a 
more forward PoA. 
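
The idea behind the COG can be illustrated as a weighted mean frequency. The following is a sketch under stated assumptions, not the measurement routine used in the study: it takes an already computed spectrum (lists of frequencies and magnitudes), and the choice of weighting each frequency by the squared magnitude (`power=2.0`) is an assumption.

```python
def center_of_gravity(frequencies_hz, magnitudes, power=2.0):
    """Mean frequency of a spectrum, weighted by |magnitude| ** power."""
    weights = [abs(m) ** power for m in magnitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("spectrum contains no energy")
    return sum(f * w for f, w in zip(frequencies_hz, weights)) / total


# Two equally strong components at 1000 Hz and 3000 Hz balance at 2000 Hz.
print(center_of_gravity([1000.0, 3000.0], [1.0, 1.0]))  # -> 2000.0
```

Shifting energy towards the higher component raises the value, which is why a higher COG is read as a more forward constriction.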


4.3 Results 

The power to extrapolate conclusions from any kind of statistical test on data 
from two subjects is extremely limited. The results derived from the current 
study should therefore be treated as indications rather than conclusions. Having 
said that, the most apt test for these data is the Mann-Whitney U-test, an 
alternative to the t-test that is non-parametric and allows for unequal samples. 
For the criteria for which binary measures were performed, a χ²-test was applied. 

Length. On the whole, Clara's rhotics are somewhat longer than Théo's: 17.37 ms 
vs. 14.42 ms. On the other hand, she also has a larger standard deviation: 
8.32 ms vs. 5.17 ms. A Mann-Whitney U-test yielded no significant result 
(z=1.49, p>.5). 
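
For readers unfamiliar with the test, the rank-based logic of the Mann-Whitney U-test can be sketched as follows. This is an illustration under assumptions, not the computation behind the reported values: it uses the plain normal approximation without tie or continuity corrections, and the durations in the usage lines are invented placeholders, not measurements from Clara or Théo.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b, z) for two independent samples.

    Tied values receive the average of the ranks they span. The z score uses
    the normal approximation without tie or continuity correction.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])

    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0   # positions i..j-1 hold ranks i+1..j
        rank_sum_a += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    return u_a, u_b, (u_a - mean_u) / sd_u


# Hypothetical durations in ms (placeholders only).
durations_child_1 = [12.0, 15.5, 21.0, 30.2, 9.8]
durations_child_2 = [11.0, 14.0, 13.5, 18.0]
print(mann_whitney_u(durations_child_1, durations_child_2))
```

Because only ranks enter the statistic, the two samples need not be of equal size or normally distributed, which is the property the text appeals to.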

Voice. The number of voiced tokens in Clara's sample is much higher (26) than 
in Théo’s (12). This translates to a proportion of .33 for Théo and .66 for Clara. 
A χ²-test was significant: χ²=7.0413, p<.01. 

Trillness. Although the French rhotic is not necessarily known as a trill, trilled 
tokens do occur. There were 11 in Théo's sample (proportion: .31), and 15 in 
Clara's (proportion: .39). This does not translate to a significant result in the 
χ²-test: χ²=.2265, p>.5. 
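
The χ² statistic for a two-by-two comparison such as voiced versus voiceless tokens per child can be computed as sketched below. Two points are assumptions rather than statements from the text: the use of Yates' continuity correction, and the sample sizes of 36 (Théo) and 39 (Clara), which are inferred here from the reported counts and proportions for voicing and do not match the 40 tokens per child mentioned in section 4.1. Under these assumptions the statistics reported for voicing (7.0413) and trillness (.2265) are reproduced.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the table [[a, b], [c, d]].

    Rows are the two children; columns are feature present / feature absent.
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


N_THEO, N_CLARA = 36, 39   # assumed sample sizes, inferred from the proportions

voice = yates_chi_square(12, N_THEO - 12, 26, N_CLARA - 26)
trill = yates_chi_square(11, N_THEO - 11, 15, N_CLARA - 15)
print(round(voice, 4), round(trill, 4))  # -> 7.0413 0.2265
```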

HNR. The mean HNR for Clara's rhotics in this study is 6.7601 dB (SD: 
3.9355), whereas Théo's mean is 3.3593 dB (SD: 4.1776). This corresponds to a 
significant difference in the Mann-Whitney U-test: Z=1.7, p<.05. 

F3. Théo’s sample rhotics are produced with a mean F3 of 4213.86 Hz (SD: 
243.38), and Clara's sample has a mean of 4304.28 Hz (SD: 350.91). The 
Mann-Whitney U-test yielded no significance: Z=1.25, p>.1. 

COG. The center of gravity in Clara's sample has a mean of 1272.29 Hz (SD: 
565.88). For Théo's sample, the center of gravity is somewhat higher: 1644.14 Hz 
(SD: 817.99). This is a non-significant difference in the Mann-Whitney U-test: 
Z=-1.1, p>.1. 

In this section, we looked at the acoustic characteristics of the rhotic productions 
of Clara and Théo, two children acquiring Québécois French. The two children 
display markedly different acquisition patterns, which correlate with either the 
phonotactic (Clara) or phonetic (Théo) identities of /ʁ/. For the six criteria 
we applied, significant differences were found only for voicing and HNR. The 
findings are summarized in Table 2. The results are not unequivocal. I take this 
to mean that the children are aware of, and struggling with, the dual identity of 
/ʁ/. In the next section, we will discuss some of the implications of this study. 


CRITERION SIGNIFICANT 
length no 
voice yes 
trill no 
HNR yes 
F3 no 
COG no 


Table 2 - Summary of the results. 


5. Discussion 


The multitude of ways in which rhotics manifest themselves, both phonetically 
and phonologically, has puzzled many linguists. As we have seen, children also 
struggle with rhotics during the course of phonological acquisition: Théo's /ʁ/ 
triggers dorsal assimilation when it combines with coronal obstruents in onset 
clusters. On the other hand, the French rhotic is remarkably fricative-like in its 
acoustic signature, which could have caused the child to parse it as an obstruent. 
A dorsal fricative is, of course, much less remarkable. Thus, the dual nature of 
the French rhotic appears to be a cause for confusion. 

Another case of dual nature is Positional Lateral Gliding as described in 
Inkelas & Rose (2008). Inkelas & Rose describe a pattern in the phonology 
of E., a child acquiring (American) English, who during a certain period does 
not produce faithful tokens of /l/. Instead, E. substitutes the glides /j/ and /w/, 
but not in a random way: in ‘strong’ positions (onsets of words and stressed 
syllables), E. substitutes /j/, whereas in ‘weak’ positions (onsets of unstressed 
syllables, codas), /w/ is inserted. This is interesting for our present case, because 
/l/, like /ʁ/, has a dual nature, albeit in a different way. Whereas in the case 
of the rhotic, there is a conflict between its phonetic contour and its distributional 
properties, the duality in the lateral lies in the fact that it involves both a coronal 
and a dorsal gesture⁹. In the case of E., the dual nature of /l/ manifests itself 
in the grammaticalized patterns of a single child, whereas what we see here is 
that two children each opt for a different route, albeit not with full confidence. 
It would be interesting to investigate the data from more children acquiring 
Québécois French, to see where Clara and Théo fit into the general picture. 

Théo’s grammar undoubtedly is not in the adult stage. Although he knows the 
features of his language, some fine tuning must ensue. If it is true that Théo's /ʁ/ 
is indeed a dorsal fricative, he has two options: either he must live with the fact 
that one of the fricatives of his language has phonotactic properties different 
from the others, or he must revise the featural make-up of the segment. If 
indeed he knows the sound is a liquid, no such revision is necessary. However, in 
all cases, he must stop assimilating onset clusters. Again, two possibilities exist: 
either he must drop the dorsal specification, or ‘unlearn’ the rule that enforces 
the assimilation. We do not have data from the moment at which assimilation 
ceased, but we know it did so shortly after the final recording (Rose 2000:238, 
footnote 3). 

We set out to investigate whether Théo and Clara had different phonetic rhotics, 
because Théo's acquisition pattern of the segment is markedly un-rhotic-like. We 
were unable to find conclusive evidence, and so we cannot know with certainty 
what the right answer is. We can, however, attempt an informed speculation as 
to the scenario: given the hypothesis that children appear to adhere to supra- 
segmental structures to the extent that these form the basis of their substitution 
patterns, not only exemplified by E. as described above, but by many others as 
well (Chiat 1989; Pater 1997; Rvachew & Andrews 2002; Marshall & Chiat 
2003), and given the fact that children are sensitive to their native language's 
phonotactics from a very early age (Saffran & Thiessen 2003), given that 
cross-linguistically, the option of a place-bearing rhotic exists, and given that the 
acoustic study, with all its limits and caveats in mind, gives no conclusive evidence 
to the contrary, I propose that Théo's /ʁ/ is, in fact, a rhotic, even if it is a 
place-bearing one. 

In this study, we examined the place of articulation specification of rhotics in a 
number of ways; we considered typological and diachronic evidence and corpus 
evidence from acquisition (as presented in Rose 2000). Finally, we set out to 
perform acoustic tests over the production data from two children who appear 
to have different underlying representations for their rhotics, presumably 
stemming from the segment's dualistic nature in their surrounding language. 
No conclusive evidence could be found in the acoustic measurements; only some 
criteria reached significance, and in the case of the PoA-criteria, the asymmetry 
goes in opposite directions in the two tests. Of course, it is possible that with 
a larger sample of children, and a larger sample of items, a more unequivocal 
picture would arise. On the other hand, it is very possible that the fact that no 
conclusive acoustic difference could be found between two subjects, who show 
phonological evidence of having different underlying representations, is simply 
a reflection of the fact that the acoustic input is the same for both children. In 
this sense, the current results are an illustration of the observation that phonetic 
measuring cannot always probe into phonological representation. 

9 Incidentally, the substitution pattern of E. closely follows the distribution of /l/ in his surrounding language 
(English): light /l/ in onsets, dark /l/ in codas. E. generalizes this pattern to strong and weak positions. 

The non-idiosyncratic relation between phonetics and phonology has been 
pointed out in many previous publications, perhaps most strongly in Substance- 
Free phonology (see Hale & Reiss 2008, for example). Furthermore, in their large- 
scale overview of studies on the acquisition of artificial phonological grammars, 
Moreton & Pater (to appear) find very little evidence for phonetic complexity as 
a factor in determining learnability. Rather, they show that structural (featural) 
complexity is a much better predictor of the relative difficulty of a learning task. In 
the current study and the works on which it builds (Rose 2000, 2003), the learners' 
systems are credited with a certain degree of abstraction. That is, learners 
construct their representations not based on acoustic information only, which is 
in line with the conclusions in Moreton & Pater (to appear). The current finding 
that different underlying representations do not necessarily lead to different 
acoustic signatures may actually reinforce the idea that phonological learning is 
abstract to a fairly high degree. In fact, when a child such as Théo apparently has 
difficulty integrating acoustic and phonotactic/distributional evidence, the effects 
are seen in the phonological behavior rather than in the corresponding phonetics. 


Acquisition of English [ɹ] by adult Pakistani learners 

Abstract 

The paper is based on perception and production tests conducted with 90 adult 
Pakistani learners of English with the aim to study their acquisition of English [ɹ]. 
The study is conducted in the SLM paradigm hypothesizing that learnability of an L2 
sound is proportional to the perceived phonetic distance between the target L2 and 
the corresponding L1 sound. The results show that Pakistani learners can discriminate 
English [ɹ] from [w] and [l] but they develop strong equivalence classification between 
English [ɹ] and the L1 [r] in their L2 phonemic inventory. 


1. Theoretical background 


Various models have been developed to account for acquisition of L2 sounds by 
adult learners. The Speech Learning Model (hereafter SLM) by Flege (1995) 
is one such model which particularly focuses on advanced/experienced learners 
(Best & Tyler 2007). The model predicts a correspondence between perception 
and production of L2 sounds. According to the SLM, L2 learners produce 
sounds of an L2 in the way they perceive them (Flege 1995:239). The model 
further predicts that if a particular sound of the L2 is perceived by L2 learners as 
different from the closest L1/L2 sound(s), a new phonetic category is developed 
by the learners for the L2 sound. But, if they cannot perceive a difference between 
an L2 and the closest L1 (or L2) sound, equivalence classification between the 
two sounds (where two sounds are equated to each other) takes place, which 
blocks the establishment of a separate phonetic representation for the L2 sound. 
According to Flege (1995), learnability of an L2 sound is proportional to the 
perceived phonetic distance between the L2 sound and the closest sound(s) of 
either the L1 or L2. The SLM provides seven hypotheses which predict learning 
outcomes in different contexts. Of these, the three hypotheses relevant to 
the current study are reproduced below from Flege (1995:239): 

1. "A new phonetic category can be established for an L2 sound that differs phonetically 
from the closest L1 sound if bilinguals discern at least some of the phonetic differences 
between the L1 and L2 sounds." 

2. "The greater the perceived phonetic dissimilarity between an L2 sound and the closest L1 
sound the more likely it is that phonetic differences between the sounds will be discerned." 

3. "Category formation for an L2 sound may be blocked by the mechanism of equivalence 
classification. When this happens, a single phonetic category will be used to process 
perceptually linked L1 and L2 sounds (diaphones). Eventually, the diaphones will resemble 
one another in production." 


Studies conducted in the SLM paradigm normally use ‘goodness of fit’ tests arranged 
with either monolinguals or early stage adult L2 learners to gauge how similar or 
different an L2 sound is from the closest L1 or L2 sounds. On the basis of such 
tests, perceptual mapping of L2 sounds in the phonemic inventory of learners is 
determined and predictions about expected learning pattern are made. For example, 
Guion et al. (2000) conducted an experiment with inexperienced Japanese learners 
of English to determine perceptual mapping of the Japanese learners for English 
consonants. Levy (2009:2680) developed a "cross-language assimilation overlap 
method" which assumes that the percentage of overlap between L1 and L2 sounds 
in the perception of monolingual speakers of the L1 of a group of learners may be 
used to determine the perceptual distance between the L2 and the corresponding 
L1 sounds. In this study (Levy 2009) the results obtained with one group of subjects 
were used to develop hypotheses for other groups of L2 learners. 

The current study focuses on perception and production of English [ɹ] by adult 
Pakistani learners who speak Saraiki as L1. Saraiki is an Indo-Aryan language 
spoken in central Pakistan (Shackle 1976) which has a rolled [r] with a phonemic 
aspiration contrast (see the phonemic inventory of Saraiki in Appendix A). 
Saraiki [r] has been described by Varma (1936:80) in the following words: 


"[r] is a rolled consonant generally accompanied by two rapid taps of the tongue against 
the teeth-ridge [...]. In the initial position as in [ris (əris)] ‘envy’, it often tends to begin 
with a vocalic on-glide and sounds somewhat like [ər]." 


Saraiki [r] is produced as a trill in stressed syllables, in emotional speech or in some 
rural dialects. There is free variation in Saraiki between rolled [r] with two taps 
and trilled [r] with continuous taps. 
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2. Hypotheses 


In order to develop hypotheses on the expected pattern of learning in light of 
the predictions of the SLM, we need to calculate the perceptual distance between 
English [ɹ] and the closest L1 and L2 sounds. The distance was calculated on 
the basis of overlap in the perception of Saraiki monolinguals, following the 
"cross-language assimilation overlap method" (Levy 2009). For this purpose, a 
perception test was conducted with 10 Saraiki monolinguals. The experiment 
was based on two discrimination tasks. The first was a three-alternative forced-choice 
(3AFC) discrimination task, in which the participants were asked to listen 
to three sounds and determine if any two of those were similar. The instructions 
were given to the monolinguals in the L1. There was one trial for each of the 
two sets of stimuli used in this test. The stimuli were nonsense syllables of 
English sounds spoken by a female native speaker of English (aged 27), 
played in the following sequence: 


1. [ala], [ana], [aɹa] 
2. [aɹa], [awa], [aja] 


The purpose of this test was to assess whether the Saraiki monolinguals 
assimilate English [ɹ] to [l], [w], [n] or [j]. In the discrimination of the [l], [n] 
and [ɹ] set, out of a total of 10 participants, 4 assimilated [ɹ] with [l] 
while 6 did not. None of the monolinguals assimilated [ɹ] 
with [n]. In the set of stimuli containing [ɹ], [w] and [j], 4 monolinguals 
discriminated [ɹ] from [w j] accurately. The remaining 6 assimilated [ɹ] with 
[w]. None of them assimilated [ɹ] with [j]. Thus the 3AFC discrimination test 
shows that the Saraiki monolinguals perceptually assimilate English [ɹ] with 
[l] and [w] but not with [j] or [n]. The sounds [w j l n] exist in the phonemic 
inventories of both Saraiki and English. 

The second part of the experiment was an AX discrimination task in which a pair 
of VCV stimuli was played to the monolinguals, who were asked to determine 
whether these sounds were the same or different. The first member of the pair of 
stimuli was a nonsense syllable [ara] consisting of Saraiki [r] with the low vowel [a] 
on both sides, spoken by a female native speaker of Saraiki (aged 39), and the second 
one was English [aɹa] spoken by a female native speaker of English. Each of the 
stimuli had three repetitions in this test. The purpose of this test was to see if the 
Saraiki monolinguals could perceive a difference between the English approximant [ɹ] 
and the L1 rolled [r]. Out of 10 monolinguals, only 2 discriminated English [ɹ] 
from the L1 [r] consistently in all three trials and 2 others discriminated it in one 
out of three trials. Thus the total percentage of accurate discrimination was 26.7% 
while 73.3% of the time the monolinguals assimilated English [ɹ] to the L1 [r]. The 
overall results of the experiment are summarized (in percentage) in Table 1 below. 


Test | Stimuli                     | Discrimination | Assimilation | Total 
3AFC | English [ɹ] & English [l]   | 60             | 40           | 100 
3AFC | English [ɹ] & English [w]   | 40             | 60           | 100 
3AFC | English [ɹ] & English [j n] | 100            | -            | 100 
AX   | L1 [r] & English [ɹ]        | 26.7           | 73.3         | 100 


Table 1 — Perception test results with Saraiki monolinguals (in percentage). 
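
The AX percentages reported above can be re-derived from the response counts given in the text. The following is a minimal sketch, assuming that each of the 10 listeners heard 3 AX trials and that the six listeners not mentioned never discriminated the pair; the function name and the list of counts are illustrative, not part of the original study materials.

```python
# Hypothetical reconstruction of the AX tally described in the text:
# 10 listeners x 3 trials each = 30 responses in total.
def ax_discrimination_rate(correct_per_listener, trials_per_listener=3):
    """Return the percentage of trials on which the pair was judged 'different'."""
    total = len(correct_per_listener) * trials_per_listener
    return 100.0 * sum(correct_per_listener) / total

# Two listeners correct on all three trials, two correct on one trial,
# six never correct (assumed from the description in the text).
correct_counts = [3, 3, 1, 1, 0, 0, 0, 0, 0, 0]

discrimination = ax_discrimination_rate(correct_counts)
assimilation = 100.0 - discrimination
print(f"discrimination: {discrimination:.1f}%  assimilation: {assimilation:.1f}%")
```

Under these assumptions, 8 of the 30 responses are accurate, which matches the 26.7% and 73.3% entered in the AX row of the table.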


Table 1 shows that Saraiki monolinguals perceptually assimilate English [ɹ] 
with the L1 [r] 73.3% of the time while 26.7% of the time they discriminate it 
from the L1 [r]. The 3AFC test shows that they assimilate English [ɹ] with 
English [w] and [l] 60% and 40% of the time, respectively. Following the idea of 
overlap between sounds (Levy 2009), we assume that there may be a maximum 
of 73.3% overlap between English [ɹ] and Saraiki [r], 60% overlap 
between English [ɹ] and [w] and 40% overlap between English [ɹ] and [l] 
in the L2 phonemic inventory of the Saraiki learners of English. On the basis 
of these results we develop the following hypotheses about the expected learning 
pattern of Pakistani learners of English: 


1. The Pakistani learners of English will acquire English [ɹ] accurately because 
they are likely to discriminate English [ɹ] from the closest sounds. 

2. Alternatively, they will assimilate it to [l], [w] or the L1 [r], with the 
directionality of difficulty of discrimination (from least to most difficult) as 
follows: 


[l] > [w] > [r] 


Thus, if Saraiki learners can discriminate between English [ɹ] and Saraiki rolled 
[r], they will acquire the English [ɹ]. The likelihood of this is a maximum of 
26.7% according to the perceptual mapping of English [ɹ] by the Saraiki speakers, 
based on the monolingual test. If a difficulty is experienced, the interfering 
sounds are likely to be [l w r] with varying levels of interference as determined by 
the monolingual tests discussed above. To test these hypotheses, we conducted 
an experiment which is detailed in the following section. 
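
The predicted directionality of difficulty follows directly from ordering the three competitor sounds by their overlap with the English approximant. A minimal sketch, assuming the overlap percentages from the monolingual test and using illustrative variable names:

```python
# Overlap (in %) between the English approximant and each competitor sound,
# taken from the monolingual perception test; the dictionary is illustrative.
overlap_with_english_r = {"[l]": 40.0, "[w]": 60.0, "L1 [r]": 73.3}

# Smaller overlap = easier to discriminate, so sort from least to most overlap.
ranking = sorted(overlap_with_english_r, key=overlap_with_english_r.get)
print(" > ".join(ranking))

# Upper bound on discriminating the approximant from the L1 rolled [r].
max_discrimination = 100.0 - overlap_with_english_r["L1 [r]"]
print(f"maximum likelihood of discrimination from L1 [r]: {max_discrimination:.1f}%")
```

The ordering that results is the one stated in hypothesis 2, with the L1 rolled [r] as the most difficult competitor.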
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3. Research methodology 


Perception and production tests were conducted with 90 adult Pakistani 
learners of English to test the hypotheses developed in section 2. The perception 
test comprised an AX discrimination task, two 3AFC discrimination tasks and 
an identification task. The 3AFC tasks and AX discrimination task followed 
the same procedure as with the monolinguals discussed in section 2. In the 
identification task, the stimulus [aɹa] spoken by the native speaker of English 
was played to the participants, who were asked to write down in English and 
Urdu on a given answer sheet what sound they heard between the two vowels. 
They were further asked to indicate if they thought that the sound they heard 
did not match any of the existing graphemes of Urdu and English. See 
Appendix B for the answer sheets. 

The production test comprised a word-reading task. The target word was reach 
which the participants read along with some other words. Each of the words 
was read three times by each of the participants. The other words included in the 
list of the stimuli were distracters so the participants did not have an idea of the 
purpose of the test. The readings of the participants were recorded and out of the 
three repetitions, the best quality recording was provided to four native speakers 
of English who evaluated these productions on a Likert scale given below: 


Criteria                                  | Marks 
Native-like                               | 5 
A little deflected away from native-like  | 4 
Different from natives but understandable | 3 
Hardly understandable                     | 2 
Unintelligible                            | 1 


Table 2 — Scale of marking used by the native speakers. 


A cut-off point of 4 on the scale is set as indicative of near-native production. Thus 
any production of the target sound that gets a score of 4 or above will be considered 
a correct production of the target sound. A score of 4 (not 5) is considered the 
cut-off point for learning because it is extremely rare for adult L2 learners to 
acquire fully native-like production. That is why the SLM also predicts that a 
new phonetic category for an L2 sound established by an adult learner may be 
deflected away from that of monolinguals of the L2 (Flege 1995:239). 
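
The scoring rule can be illustrated with a short sketch. It assumes that a participant's score is the mean of the four judges' ratings on the 1 to 5 scale, which the text does not state explicitly; the function name and the example ratings are hypothetical.

```python
from statistics import mean

CUTOFF = 4.0  # near-native threshold on the 5-point scale

def has_acquired(ratings, cutoff=CUTOFF):
    """Return True if the mean judge rating reaches the cut-off.

    Assumption: one rating per judge, averaged across judges.
    """
    return mean(ratings) >= cutoff

# Hypothetical ratings from four judges for two participants.
print(has_acquired([4, 4, 5, 4]))  # mean 4.25, at or above the cut-off
print(has_acquired([3, 4, 4, 3]))  # mean 3.5, below the cut-off
```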



Nasir A. Syed 


3.1 Participants of the study 

Three groups of learners were selected for this study with the goal of evaluating 
whether continued exposure improved learners' production and perception of 
English [ɹ]. In Pakistan, English is taught as a compulsory module to students 
from primary to Bachelor's level and is used as the medium of instruction in 
many disciplines at post-secondary level. All groups involved advanced learners 
who had been learning English for at least 14 years, but they differed with 
respect to whether they (a) actively used English, (b) specialised in English at 
MA level, or (c) had exposure to English native speakers. Group (i) consisted 
of 30 educated adults based in Pakistan who were all graduates from Pakistani 
universities and had specialised in subjects other than linguistics or English 
language. This group only uses English for academic purposes or for official 
correspondence. Thus we call them ‘Inactive Learners’ of English. Group (ii) 
consisted of 30 students of MA English studying English language, linguistics 
and literature in Pakistan. In the following discussion we shall refer to this 
group as the ‘Student’ group. Group (iii) consisted of learners based in Essex 
(UK) who left Pakistan after getting their first degree from Pakistan. They will 
be referred to as UK-based learners in the following discussion. 

The participants of all groups originate from the same area; all speak Saraiki 
as L1 and all studied in similar types of institutions in Pakistan. The purpose 
of including the UK and Student learners in the study is to assess the role 
of native input in the former and that of active learning in Pakistan in 
the latter group in the acquisition of English [ɹ]. The performance of the Inactive 
Learners will be used for comparative analysis as all groups of learners were 
similar up to BA level. Afterwards, the Student group went on to MA English 
courses and the UK group came to England. Thus, any better performance of 
the Student learners vis-à-vis the Inactive Learners will be ascribed to their 
active learning of English in Pakistani universities. Similarly, any improvement 
noted in the UK group vis-à-vis the Inactive Learners will be ascribed to the 
input that the former are getting in the UK. 


3.2 Stimuli 

The stimuli were recorded in the voice of a female native speaker of English in 
a psycholinguistic laboratory at the University of Essex. The target consonants were 
recorded with a low vowel on each side, i.e. [aɹa] etc. The stimulus for Saraiki [r] 
was recorded in the same form, i.e. [ara], in the voice of a female native speaker of 
Saraiki. These stimuli were used in the perception test. The methodology used 
for these tests was the same as discussed in section 2. 
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4. Presentation of data 


In this section the results of the perception and production tests are presented 
separately. The perception test results are presented first followed by the 
production test results. 


4.1 Perception of [ɹ] 

As mentioned above, the perception test consisted of an identification task, an 
AX discrimination task and two 3AFC discrimination tasks. Table 3 shows the 
perception test results in percentage. The results show that in the identification 
task and in the 3AFC-1 task the UK group performed better than the 
Student group, who in turn performed better than the Inactive Learners group. 
However, in the 3AFC-2 discrimination task, the performance of all three 
groups is equally good. In the AX discrimination task, the Inactive Learners group 
performed better than the other two groups, in contrast to the trend seen for the 
identification and 3AFC-1 discrimination tasks. However, the overall performance 
of all the groups is poor in the AX discrimination test. A non-parametric test 
confirms the group variance as statistically significant in the identification task 
(χ²=17.603, p<.001), the 3AFC-1 discrimination task (χ²=13.075, p<.001), 
and the AX discrimination task (χ²=9.068, p<.01). The increasing trend in the 
performance of the groups is also significant (p<.001). However, group variance 
in the 3AFC-2 discrimination task is non-significant (p>.1). 


Group             | Identification [aɹa] | 3AFC-1 [ɹ w j] | 3AFC-2 [ɹ n l] | AX (L1 /r/ - L2 /ɹ/) 
UK                | 93.33                | 93.10          | 93.33          | 33.33 
Student           | 78.89                | 80.00          | 83.33          | 26.67 
Inactive Learners | 50.00                | 53.33          | 86.67          | 53.33 


Table 3 — Accuracy (in percentage) in perception test. 


4.2 Production of [ɹ] 

The production test was based on a word-reading task. Four native speakers of 
English evaluated the productions. The overall reliability of the judges' 
evaluations was .62 (Cronbach's alpha=.622). The following are the average scores 
obtained by the participants for the production of English [ɹ] in the word reach. 
The standard deviations are given in parentheses. 


Group             | Mean score 
UK                | 3.72 (.60) 
Student           | 3.68 (.32) 
Inactive Learners | 3.41 (.48) 


Table 4 - Average scores in the production of [ɹ]. 
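
For readers unfamiliar with the reliability coefficient reported above, the following sketch shows how Cronbach's alpha is computed from a judges-by-participants rating matrix, using the standard formula alpha = k/(k-1) * (1 - sum of per-judge variances / variance of summed scores). The ratings are invented for illustration and are not the study's data.

```python
from statistics import pvariance

def cronbach_alpha(ratings_by_judge):
    """Cronbach's alpha for k judges who each rated the same participants.

    ratings_by_judge: list of k lists, one list of scores per judge.
    """
    k = len(ratings_by_judge)
    # Sum of the variances of each judge's scores.
    judge_variances = sum(pvariance(scores) for scores in ratings_by_judge)
    # Variance of each participant's total score across judges.
    totals = [sum(scores) for scores in zip(*ratings_by_judge)]
    total_variance = pvariance(totals)
    return (k / (k - 1)) * (1 - judge_variances / total_variance)

# Hypothetical ratings: four judges, five participants.
example = [
    [4, 3, 5, 2, 4],
    [4, 3, 4, 3, 4],
    [3, 4, 5, 2, 3],
    [4, 3, 4, 2, 5],
]
print(round(cronbach_alpha(example), 3))
```

When all judges give identical scores the coefficient is 1; lower values, such as the .622 reported here, indicate more disagreement among judges.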


A one-way ANOVA shows significant group variance (F(2,87)=7.165, p<.001)¹ 
but a post-hoc analysis only confirms variance between the UK and Inactive 
learners (p<.001). The results show that the learners did not perform well in 
this test. None of the groups could obtain an average score of 4, which was 
fixed as the minimum cut-off point for learning. Although the scores only point 
out the relative performance of the participants in the production of the target 
sound (not the actual nature of the consonant produced by the participants), 
later acoustic analysis shows that the learners produced English [ɹ] as L1 rolled 
[r]. The results of the perception and production tests are analyzed and discussed in 
the following section. 
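
The degrees of freedom of the reported F statistic follow from the design: three groups and 90 participants give 3 - 1 = 2 and 90 - 3 = 87. A minimal sketch of a one-way ANOVA written with the standard library only; the function name and the toy scores are illustrative and do not reproduce the study's data.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Toy data: three groups of three scores each.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_b, df_w)
```

With three groups of 30 scores each, the same function returns degrees of freedom of 2 and 87, as in the statistic reported above.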


5. Analysis and discussion 


The production test results show that the learners have very poor production 
of English [ɹ], as none of the groups of learners could 
obtain an average score of 4, which is the cut-off point for considering them 
as having acquired the target sound. The perception test results show that the 
performance of all groups including the Inactive learners is excellent in the 
discrimination of [ɹ] from [l], which indicates that the learners can discriminate 
English [ɹ] from [l] from the early stages of learning. The reason for including the 
[l] - [ɹ] contrast in the perception test was to evaluate how well Pakistani learners 
can discriminate the two sounds, since previous research on some L2 learners 
of English has shown perceptual assimilation of [ɹ] with [l] (e.g. Brown 1998, 
2000; Flege et al. 1996; Larson-Hall 2004). In the identification and 3AFC-1 
discrimination tests, the UK and Student participants performed better than 
the Inactive learners. In 3AFC-2, all three groups performed equally well. 
The 3AFC-1 test was based on discrimination between [ɹ] and [w j] and the 
3AFC-2 was based on discrimination between English [ɹ] and [l n]. This means both 
the Student and UK learners have learnt to discriminate [ɹ] from [j w l n] and 
the Inactive group has learnt to discriminate it from [l n]. However, in the 
AX discrimination test, all participants are poor as they cannot perceive the 
difference between English [ɹ] and the corresponding L1 [r]. 


1  A Kolmogorov-Smirnov test confirms the normal distribution of the data (p>.05). 

This performance of the learners corresponds with that of the Saraiki 
monolinguals, who also assimilated English [ɹ] with [l], [w] and L1 [r] (see 
Table 1). The results show that the L2 learners are faced with 
difficulty in acquiring English [ɹ] in the initial stages, but some learning must have 
occurred, as reflected in the improved performance of the 3 groups of learners. 
The performance of the 3 groups reveals a particular directionality of learning. 
The Inactive group, who make the least use of English, have learnt to perceive 
the difference between [ɹ] and [l] but are not able to discriminate English [ɹ] 
from [w] and L1 [r]. The UK and Student groups learnt to differentiate English 
[ɹ] from [l] and [w] with an accuracy of 80% or above (see Table 3). The two 
groups could however not discriminate between English [ɹ] and L1 [r] and only 
have an accuracy rate of <34% for this contrast. These subjects performed well 
in the identification task and the 3AFC discrimination tasks because these tasks 
involved their ability to differentiate English [ɹ] from the other consonant 
sounds of English. But the AX discrimination task results show strong 
equivalence classification between English [ɹ] and L1 [r] in the L2 phonemic 
inventory of these learners. As a result they produced the approximant English 
[ɹ] as a rolled [r] as in the L1 (explaining the poor scores they received in the 
production task). The overall results show a clear learning pattern with respect 
to the discrimination of English [ɹ] from [w], [l] and L1 [r]. The directionality 
of difficulty for the learners (from least to most difficult) is as given below: 


[l] > [w] > L1 [r] 


Thus Pakistani learners first learn to discriminate English [ɹ] from [l] (as the 
performance of all participants shows), followed by the discrimination of [ɹ] 
from [w] based on training and greater input (see the performance of the 
Student and UK groups). The greatest difficulty comes from the discrimination 
of English [ɹ] from the L1 [r], which even the UK-based group with input 
from native speakers cannot overcome. The most advanced Pakistani learners 
are therefore only able to develop representations for English [ɹ] that are separate 
from [w] and [l]. We can depict the emerging learning process in the 3 groups in 
Figure 1. 


[Figure: group panels, including Student and Inactive] 


Figure 1 - Development of discrimination between L2 [ɹ] and L1 [w]. 


The above figure shows that in the L2 phonemic inventory of the Inactive 
learners [ɹ/r] and [w] overlap to a large extent, while this is less so in the other 
groups, who manage to separate the two sounds and mainly treat them as 
separate categories. The UK group fares best in the separation while the Student 
group can be predicted to show more variable discrimination because of the 
higher overlap. 

The above results are based on collective group performance. If we consider 
individual performance and use 4 as the near native-like performance cut-off 
point in the production test, then there are 3 UK-based participants who have 
a near native-like performance in the production and perception of English 
[ɹ]. These 3 participants perceived English [ɹ] accurately in all repetitions of 
all the perception tasks and also obtained a score of 4 in the production task. We 
can conclude that only 3 UK-based participants developed an independent 
phonetic category for English [ɹ]. This is illustrated in the following figure, 
which contains two spectrograms of the word reach as produced by one of the 3 
native-like participants (left spectrogram) and by another participant who is as 
yet unable to discriminate between English [ɹ] and L1 [r] (right spectrogram). 


Figure 2 - Spectrograms of the word ‘reach’. 


The left-hand spectrogram shows that the participant who was able to 
discriminate English [ɹ] from the L1 [r] produced the word reach with 
an approximant gesture word-initially, but the participant who could not 
discriminate between English [ɹ] and the L1 [r] produced the English [ɹ] in the 
word reach with a tap or trill, as the right-hand spectrogram shows. Besides, 
on the pattern of the L1 [r], the participant has also added a vocalic gesture at 
the beginning of the word reach, virtually producing the word reach as [əritʃ]. 
This demonstrates that most of the learners could not acquire approximant [ɹ] 
in English; some of them even failed to suppress the epenthesis of an initial vocalic 
gesture in the words of English starting with [ɹ] (a phenomenon transferred 
from the L1). The epenthetic vowel at the beginning of the word reach produced 
by the participant is clearly reflected in the following waveform, highlighted in 
a rectangular box. This is an example of negative transfer from the L1 as a result 
of a strong equivalence classification between L2 [ɹ] and L1 [r]. 


Figure 3 - Waveform of the word reach by one of the participants. 


6. Conclusion 


This paper reported on an experiment testing whether Pakistani learners of 
English acquire English [ɹ] accurately or assimilate it with [w], [l], or the 
L1 [r]. The results show that although there has been some progress in the 
acquisition of English [ɹ], the learners have not accurately acquired English [ɹ], 
even though there are individual participants who show that such acquisition is 
possible. On the basis of the results from the 3 groups we are able to map out a 
clear developmental path in the discrimination of English [ɹ] from the closest 
sounds, namely [l], [w] and L1 [r]. The group with the least exposure to English 
after classroom learning (Inactive Learners) shows the least acquisition and is 
only able to discriminate English [ɹ] from [l]. The intermediate group in terms 
of exposure (Student learners in Pakistan), who because they have specialised in 
English at MA level have a higher English usage than the first group, are better 
at the discrimination of English [ɹ] from [l] and can also discriminate it from 
[w]. The most advanced group in terms of more systematic day-to-day exposure 
to English, the UK group, has overall better results, even though its members 
still fall short of the accurate acquisition of English [ɹ]. 

The overall developmental path attested is parallel to the performance of the 
Saraiki monolinguals, who showed variation in the discrimination of English 
[ɹ] from the closest sounds, with accuracy gradually declining from [l] (60%) 
to [w] (40%) to [r] (26.7%). This supports the idea of the SLM that "the greater 
the perceived phonetic dissimilarity between an L2 sound and the closest 
L1 sound, the more likely it is that phonetic differences between the sounds 
will be discerned". The SLM is further supported by an individual analysis of 
the results, which shows that the three participants of the UK-based group 
who could perceive a difference between English [ɹ] and the closest sounds, 
including L1 [r], are also able to produce English [ɹ] accurately. 

There are two outstanding issues. The first one is why the Inactive 
Learners group, with the least English exposure, performed better than the 
other two groups in the AX discrimination task (see Table 3). This may be 
better considered not within the framework of second language acquisition, but 
within a sociolinguistic one. It might well be that people employing English 
for professional purposes are more aware of the difference between their own 
and the native pronunciation, without being able to reproduce it. In this respect 
the ‘Inactive learners’ are really inactive in their production: compared with 
the other two groups, who are currently exposed to different kinds of input, their 
production is fossilized, so that they cannot produce the L2 sound differently from 
the closest L1 sound, although most of them perceive the difference between the 
L1 and L2 consonants. 

The second issue is that of the insertion of an epenthetic vowel at the beginning 
of words starting with [r] in Saraiki and its implications for the acquisition 
of L2. In this regard my point of view is that at some stage of their historical 
development Indo-Aryan languages did not accept word-initial consonants 
(Masica 1993). At that stage all words started with vowels. Later on, these languages 
started accepting consonants word-initially, but as a remnant of the old pattern the 
speakers added some vocalic gesture or schwa-like insertion at the beginning 
of words starting with consonants. Epenthesis of a vowel before sonorants 
and strong pre-voicing in obstruents in Indo-Aryan languages like Saraiki is a 
remnant of that period of the language history. However, both these issues need 
further investigation and are left for future research. 


Appendix A: Phonemic inventory of Saraiki² 


A sl IE 
5 E a T ae, 
$ E 5 S á 5 3 8 © 
= El S. El [e] ER "E & d 
e 5 E RE E a ES x: [m 
5 " ed 
E. 
plosive - - p t t c k 
- " p e t ch ke 
$ - b q q j g 
F n b^ d d p g^ 
implosive b d f 
fricative - f s f 
+ Z y fi 
nasal + 7 m n n p 0 
" " mb ph EU p 
flaps - f t 
" E p 
lateral - 1 
pn p 
approximant - v j 
+ v^ 


2  Shackle (1976:18) does not include the breathy voiced alveo-palatal nasal in the consonantal inventory of 
Saraiki but the sound does exist in the language. Examples are words like, /kap^à/ ‘late’ and /máp'ar/ ‘castrated’. 


Appendix B: Answer sheets 


1: Answer sheet for the identification test 


Instructions for the participants: You will listen some consonants of English each 
flanked by a long vowel [a] on both sides. After listening the consonants, just note in the 
blank space provided in the sheet the consonant you have heard between two a's. Also 
note the same consonant in Urdu in the next column. If the sound does not exist in either 
of the languages, please point out in column three of the sheet. 


S.No. | Consonant | Corresponding letter in Urdu | Remarks 
1     |           |                              | 
2     |           |                              | 
3     |           |                              | 
4     |           |                              | 
5     |           |                              | 
6     |           |                              | 
7     |           |                              | 
8     |           |                              | 
9     |           |                              | 
10    |           |                              | 


2: Answer sheet for 3AFC discrimination test 


Instructions for the participants: First the target sound will be played. After a pause 
a pair of sounds will be played. If the first sound of the pair matches the target, tick in 
column A of the answer sheet, if the second one matches the target sound tick in column 
B and if neither of the sounds matches with the target sound, cross (x) in column C. 


S. No. | Column A (1) | Column B (2) | Column C (x) 
1      |              |              | 
2      |              |              | 
3      |              |              | 
4      |              |              | 
5      |              |              | 


3: Answer sheet for the AX discrimination test 


Instructions for the participants: Please listen to the pairs of sounds and determine if 
the consonants in the sounds are identical or different by ticking in the relevant column. 
Please ignore the difference in tone, pitch and intonation of the speakers and decide only 
on the basis of the consonant between two vowels. 


S. No. | Identical | Different | Remarks 
1      |           |           | 


On rhotics in a bilingual community: 
A preliminary UTI research 


Lorenzo Spreafico & Alessandro Vietti, Language Study Unit, 
Free University of Bozen-Bolzano 


Abstract 

In this paper we offer an Ultrasound Tongue Imaging (UTI) based description of 
rhotics in bilingual speakers from South-Tyrol. In particular we examine whether adult 
Italian/Tyrolean bilinguals display differentiated patterns of articulation for rhotics in 
each language they speak and whether bilinguals’ articulatory patterns in each examined 
language are similar to those used by almost monolingual speakers or not. Intraspeaker 
comparison shows that very late sequential bilinguals do not present distinct articulatory 
patterns for rhotics in the two languages, while the simultaneous bilinguals do. Besides, 
interspeaker comparison shows that articulatory patterns for rhotics used by simultaneous 
monolinguals differ from those used by the very late sequential bilingual speakers. These 
data help to understand how phonological categories are organized by bilinguals, and 
address the long-debated issue of whether bilinguals make use of a 
single shared phonological system or of two separate ones. 


1. Introduction 


1.1 Background 

This study¹ is part of a project aimed at collecting a socially-stratified articulatory 
corpus using the UTI technique. The participants included in the database are 
bilingual speakers of Italian and of Tyrolean as they are spoken in South Tyrol. 
From a sociolinguistic point of view, South Tyrol is characterized by 
societal bilingualism with two quite separate linguistic communities: Tyrolean and 
Italian. These two communities exhibit marked asymmetries in their linguistic 
repertoires (Table 1). The linguistic repertoire of the members of the Tyrolean 
community is characterized by a medial diglossia, with Tyrolean, a southern 
Bavarian dialect (Wiesinger 1989; Barker 2005), in low position, and 
Standard German in high position (Ciccolone 2010; Lanthaler 1990). Moreover, 
the repertoire of the German community very often includes Italian, especially 
if speakers with a middle-high level of education and living in main towns such 
as the capital city Bozen-Bolzano are considered. 

In contrast, the members of the Italian community are not markedly bilingual 
with respect to Tyrolean, although they are likely to display fair competence 
in an Italo-Romance dialect, especially if they are of an older generation, or 
in Standard German, especially when they belong to the younger community 
and learnt it in school. 


          TYROLEAN COMMUNITY                      ITALIAN COMMUNITY
          L1                  L2                  L1                        L2
HIGH      Standard German     Standard Italian    Italian                   (Standard German)
MIDDLE    (Bozner Deutsch)    Regional Italian
LOW       Tyrolean                                (Italo-Romance dialect)


Table 1 - Linguistic repertoires in South Tyrol.


1.2 Rhotics in South Tyrol 

What are the consequences of this situation for the phonetics and phonology of
Italian and Tyrolean as they are spoken in South Tyrol? Unfortunately, research
on this topic is scant and in fact limited to one volume (Tonelli 2002). Even
scarcer, however, are investigations offering data on rhotics. As for Italian spoken
in the area, we can refer to auditory investigations by Mioni (1990, 2001),
Canepari (1990) and Tonelli (2002), and to instrumental investigations by Vietti,
Spreafico & Romano (2010), Spreafico & Vietti (2010), Vietti & Spreafico
(2010) and Spreafico & Vietti (2011). As for Tyrolean, interesting exceptions
are Klein & Schmitt (1969) and again Tonelli (2002).

Mioni's (1990) investigation is limited to utterances in Italian produced by
informants living in the cities, and in particular it focuses on monolingual and
bilingual students. As regards rhotics in Italian monolinguals, he states that
the apicoalveolar tap usually prevails. As for the bilinguals, the author reports
that all his informants (with no significant distinctions) use some sort of uvular
rhotic, which, in his view, reveals an influence of the Bavarian
dialect substratum and, in a way, indexes speakers' ethnicity¹. In contrast, on the
basis of auditory analyses (supposedly of monolinguals' utterances only), Canepari
(1990) reports a tendency to use uvular pronunciations (e.g. [R; x]), which
at times can even be accompanied by alveo-uvular pronunciations. Yet Tonelli
(2002) shows that the only variant of the /r/ sound to be found in an Italian
sample (again comprising monolinguals only) living in Bolzano is [r], which is
sometimes, and in marked pronunciation only, replaced by [r].


1 This becomes even more evident if one takes into account that, as reported by Mioni (2001), the Italian
phonology of these informants is properly acquired and is substantially the same as that used by the
Italian native speakers around them.

Vietti & Spreafico (2010) offered a different picture of this phenomenon. They 
acoustically analyzed the type of /r/ realizations in Italian productions by South 
Tyrolean informants and pointed out that sometimes both apical and uvular 
realizations can be detected in utterances and even in isolated words produced 
by the same informant. They examined a sample of 11 speakers and about 500
occurrences and showed that their informants make use of many more allophones
than those documented in previous research: [r]^; [op] Ls]; [r]; bs [r]; [t]; [z]; 
[x]; [š]. In addition, they identified several instances of deletion, as well as other
phones that could hardly be categorized, mostly because the acoustic
and auditory data were contradictory.

Systematic research on rhotics in South Tyrolean is sparse and limited to the
information provided by the Tirolischer Sprachatlas (Klein & Schmitt 1969). The
/r/ realizations recorded in Klein & Schmitt (1969) reveal considerable diatopic
variation, with salient differences across the broader area of South Tyrol⁴. For example,
the analysis of some of the maps in the volume on Konsonantismus, Vokalquantität,
Formenlehre for the capital city of Bozen-Bolzano shows that uvular articulations
are registered in six out of nine cases⁵, while apicoalveolar articulations are reported
for the rest. The alternation between front and back realizations also seems to affect
the so-called Bozner Deutsch, which, according to Tonelli (2002), is characterized
by [r] and exceptionally by [r]. These observations are consistent with those
reported in studies on bordering areas, as in the case of Ulbrich & Ulbrich (2007)
on Austrian German: they note that the spectroacoustic analysis of
newsreaders' productions reveals a prevailing use of uvular realizations in onset
position (especially [r] and [p], but also [y] and [s], which may be due to backing
phenomena) and mainly vocalized variants of /r/ in coda position, although not
excluding apical articulation.


2 Both tap and, to a lesser extent, flap [r].


3 Uvular tap. This sound, unknown in the IPA, is transcribed by the symbol [p] according to a proposal
made by Demolin et al. (ms).


4 E.g. deletions and apical realizations in the Western Pustertal versus uvular trills in the Eastern Pustertal.


5 Uvular articulations are reported for: Durst, map 50; Wurst, map 51; Werden, map 58; Hertz, Fertig, Wird,
map 54. Apicoalveolar articulations are registered for: Feuer, Bauer, Bauertag, map 91. It is worth noticing
here that there seems to be an isogloss running NE-SW along the Eisack Valley separating /r/ dialects in
the West from /r/ dialects in the East.

The brief discussion offered above clearly shows the lack of systematic
investigation of both Italian and the Tyrolean dialect with respect to rhotics.
Therefore, this research also helps to fill the gap by offering a preliminary
instrumental description.


2. Methods 


2.1 Informants 

In order to answer the research questions of whether adult Italian/Tyrolean
bilinguals display differentiated patterns of articulation for rhotics and
whether patterns of articulation in adult bilinguals are similar to those of
monolingual speakers, we collected a socially-stratified articulatory corpus using
the UTI technique (Stone 2005; Iskarous 2005; Davidson 2012).

The nineteen informants included in the database are bilingual speakers of
Italian and of Tyrolean as spoken in South Tyrol. They are all in their mid 30s
and were born and raised in Bozen-Bolzano, the capital city of South Tyrol.
Initially, a questionnaire was used to determine the participants' length and
amount of exposure to the two languages. On that basis, each informant was
assigned to one of four groups on a bilingualism discretum scale: simultaneous
bilinguals, early sequential bilinguals, late sequential bilinguals and very late
sequential bilinguals.

The assignment was based mostly on two parameters: the rate of bilingualism in the
family, that is, whether the informant's parents were native speakers of the same
language or not, and the rate of dual language exposure, in other words whether
the informant had been in contact with Italian and the Tyrolean dialect from
birth, from nursery school on, from primary school on or from secondary school
on only (as shown by Simonet 2010 for Catalan)⁶.

In order to control for the real exposure to the two languages and to obtain a
better understanding of the sociolinguistic milieu, and hence of the sociophonetic
environment each informant was embedded in (Khattab 2002), we collected
social network data for each speaker using an egocentric approach, which
examines individuals' immediate neighbors and associated interconnections
(Milroy & Milroy 1985; Scott 2000). This allowed us to assess the amount
of Italian or Tyrolean each speaker was exposed to and actually resorted to in
his/her daily life.
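
As a rough illustration of how such exposure rates could be tallied, the sketch below turns a list of logged encounters into percentage rates per language. The encounter labels and the `interaction_rates` helper are hypothetical, and the output format of the network software used in the study is not reproduced here.

```python
from collections import Counter


def interaction_rates(encounters):
    """Return the share (%) of encounters per language label, rounded to integers."""
    if not encounters:
        return {}
    counts = Counter(encounters)
    total = len(encounters)
    return {label: round(100 * n / total) for label, n in counts.items()}


# Hypothetical log of one speaker's last ten encounters (invented data).
log = ["Italian"] * 8 + ["Tyrolean"] + ["Italian+Tyrolean"]
rates = interaction_rates(log)
print(rates)  # {'Italian': 80, 'Tyrolean': 10, 'Italian+Tyrolean': 10}
```

Because each share is rounded independently, the rounded values need not add up to exactly 100.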


6 It is important to notice here that South Tyrol has a split school system with segregated Italian and
German schools and that in the latter lessons are supposed to be taught in Standard German and not in
Tyrolean. This means that in South Tyrol, Tyrolean can be acquired via spontaneous interactions only,
whereas Italian can also be learnt via formal instruction.

By linking the data from the questionnaire and those from the egocentric social
network we were able to include in the corpus 8 simultaneous bilinguals, 3
early sequential bilinguals, 4 late sequential bilinguals and 4 very late sequential
bilinguals.

In this paper we focus only on the analysis of rhotics as they are articulated
by seven of the nineteen speakers we recorded, namely those belonging
to the opposite poles of the discretum (see Table 2): two very late sequential
bilinguals (LSB) and five simultaneous bilinguals (SB). Each of the very late
sequential bilinguals grew up in a strictly monolingual family, Italian (LSB1,
female) and Tyrolean (LSB2, female) respectively, and, according to data from
their social network at the time of our recording, had almost no contact with
members of the other language community.

On the other hand, the simultaneous bilingual speakers SB1 (male), SB2
(male), SB3 (male), SB4 (female) and SB5 (female) came from bilingual families
(in the sense that each of their parents was a native speaker of one of the two
languages), attended both Italian and German schools, and, according to their
egocentric network, kept up relationships equally with members of the two
language communities.


SPEAKER  AGE  SEX  RATE OF INTERACTION (%) PER LANGUAGE OR COMBINATION   TOTAL*
LSB1     23   F    93   0    0    0                                      93
LSB2     24   F    7    87   7    0    0                                 101
SB1      31   M    47   13   0    13   13   0                            86
SB2      21   M    80   7    0    0    y                                 101
SB3      38   M    40   47   0    0    0                                 94
SB4      41   F    87   0    0    0    0    0                            87
SB5      22   F    80   0    0    13   7    0                            100


Table 2 - Speakers' rate of interaction (%) in each language or combination of languages for
their last 10 encounters during the day of data collection. Information retrieved via the
EgoNet software (McCarthy 2011). *Not all logical combinations are reported; the total might
differ from 100%.

2.2 Procedure

For data collection, we used the Articulate Instruments multichannel acquisition 
system called Articulate Assistant Advanced (AAA) (Articulate Instruments 
2011). 

Articulatory data was recorded using a portable SonoSite 180 ultrasound 
machine equipped with a SonoSite ICT intracavitary array transducer operating 
at 4-7 MHz. The frame rate was automatically and unchangeably set at 15 Hz; 
the depth was autonomously set at 7 cm; the field of view was 120°. The probe 
was held by a stabilizing helmet to make sure that it adhered to the speaker's 
chin and was kept in constant relationship to the speaker's palate. 

Acoustic data was recorded at 22,050 Hz using a Marantz PMD660 recorder
coupled with a Beyerdynamic MCE86N microphone. The audio signal exiting
from the recorder was synchronized to the video signal coming from the
ultrasound machine via the SyncBrightUp™ unit (Articulate Instruments
2011). This device was triggered by an audio beep generated by AAA upon
pressing the start recording button. The software then superimposed a white
mark on the video signal and generated a sync pulse used to synchronise the
audio and video signals during the analysis.
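
Once a common time origin is known, relating the two streams is simple arithmetic. The sketch below assumes, purely for illustration, that video frame 0 begins exactly at the sync pulse and that frames are evenly spaced at the nominal 15 Hz; `frame_to_sample_range` is a hypothetical helper, not part of AAA.

```python
AUDIO_RATE_HZ = 22_050   # audio sampling rate reported above
VIDEO_RATE_HZ = 15       # ultrasound frame rate reported above


def frame_to_sample_range(frame_index, sync_sample=0):
    """Map a video frame index to the half-open range of audio samples it spans.

    Assumes frame 0 starts at `sync_sample` (the audio sample of the sync pulse).
    """
    samples_per_frame = AUDIO_RATE_HZ // VIDEO_RATE_HZ  # 22050 / 15 = 1470 exactly
    start = sync_sample + frame_index * samples_per_frame
    return start, start + samples_per_frame


print(frame_to_sample_range(0))    # (0, 1470)
print(frame_to_sample_range(15))   # (22050, 23520)
```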

Overall, 38 written prompts were presented to each informant via a PC monitor.
First, two test words were presented to the speakers to acquaint them with the
procedure. Then two word lists were presented to the participants, one in Italian
and one in Tyrolean⁷. Each list contained 18 randomly arranged target words
beginning with a CRV sequence of the kind: plosive plus rhotic plus high or
low vowel (see Appendix 1). These sequences were chosen to control the high
contextual variability of /r/ already observed in Vietti & Spreafico (2008), so as to
allow a better comparison of static articulations in the two languages, as well as
to allow an analysis of coarticulation phenomena in onset clusters⁸.

In addition to the target words, each list contained two distractors used to
prompt informants to swallow some water or eat some pudding. This
was needed to collect palate images of sufficient quality to serve as
reference for the subsequent analysis. Each sequence of written prompts was
presented to the informants in the same order three times, so in the end we
were able to record 114 words for each speaker. The repetition was needed to ensure that,
notwithstanding the slow and unalterable scan rate of 15 Hz, at least one image
of the tongue during the short constriction phase could be cleanly captured for
each of the eighteen CRV sequences in the two languages.
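
The design arithmetic can be checked with a few lines of code. The sketch assumes that the 38 prompts are the two test words plus the 2 × 18 target words, and that the token count is taken over the seven speakers analysed in this paper; the function and key names are illustrative.

```python
def design_counts(test_words=2, targets_per_list=18, lists=2,
                  repetitions=3, speakers=7, frame_rate_hz=15):
    """Recompute the counts implied by the elicitation design."""
    prompts = test_words + targets_per_list * lists
    return {
        "prompts_per_round": prompts,
        "words_per_speaker": prompts * repetitions,
        "target_tokens": targets_per_list * lists * repetitions * speakers,
        "ms_between_frames": 1000 / frame_rate_hz,
    }


counts = design_counts()
print(counts["prompts_per_round"])            # 38
print(counts["words_per_speaker"])            # 114
print(counts["target_tokens"])                # 756
print(round(counts["ms_between_frames"], 1))  # 66.7
```

With roughly 67 ms between frames, a brief constriction can fall between two scans, which is why each prompt sequence was repeated.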


7 Since there are no common writing conventions for Tyrolean, which are inherited and customized from
Standard German, informants were allowed to examine a printed copy of the word list before the test to be
sure they would recognize all forms it contained.


8 We leave this matter for future research.

All speakers were individually recorded in a soundproof room, and whenever
possible two researchers at a time attended the data collection session and
interacted with the informants. This was arranged to ensure that both a native
speaker of Italian and a native speaker of Tyrolean were present at the same
time, so as to create a truly bilingual environment and keep the informant in the
bilingual mode (Grosjean 1998).

For data analysis, we ran a parallel auditory⁹/articulatory analysis based on the
audio recordings and on the synchronized mid-sagittal ultrasound images of the
tongue. The /r/ tokens were coded for one of seven categories: four dorsals (trill,
tap, fricative, approximant); two coronals (trill, tap); and deletion.

Then we semi-automatically fitted the mid-sagittal tongue surface using AAA
(version 2.13), which also allowed for manual correction of the splines. If we could
draw more than one spline traceable back to the same rhotic, we exported only
the one corresponding to the closure phase for trills and taps or the medial
one for fricatives and approximants. Finally, we transferred the curves drawn
onto the raw ultrasound image in Cartesian coordinates to a spreadsheet as the
basis of a qualitative analysis.


3. Data 


3.1 Data analysis 

Of the 756 rated tokens, only 585 were included in the analysis (Table 3).
Problems in tongue imaging common to most UTI research¹⁰, such as
discontinuities in the surface contour due to asynchronies between the scan rate
and the frame rate, as well as to shadows cast by the hyoid bone or the jaw, or to
ultrasound refraction, forced us to discard many tokens. This especially held for
SB1, for whom we were only able to extract 22 out of 54 profiles in Tyrolean
and 47 in Italian¹¹.


9 Even if an auditory classification was undertaken, spectrograms were also used to support the
classification.


10 Relevant UTI works on rhotics include, among many others, Iskarous et al. (2010); Lawson et al.
(2008); Proctor (2009); Scobbie & Sebregts (2011).


11 Apparently in Tyrolean the tongue assumed a position that differed from that displayed during the
instrumentation set-up based on inter-utterance rest positions and hence caused the tongue to parallel the
beam orientation, thus refracting the ultrasound. In Italian the phenomenon was rarer, which raises the
more general issue of language-specific articulatory settings (Gick et al. 2004).

         TYROLEAN                               ITALIAN                                TOTAL
         Tokens  Coronals %  Major allophone    Tokens  Coronals %  Major allophone    Tokens  Coronals %
LSB1     42      100         [r]                23      100         [r]                65      100
LSB2     41      0           [x]                30      0           [x]                71      0
SB1      22      0           [x]                47      0           hd                 69      0
SB2      51      0           [x]                53      0           [n]                104     0
SB3      51      0           [x]                51      0           [š]                102     0
SB4      49      0           [x]                49      100         [r]                98      50
SB5      48      0           [x]                28      100         [r]                76      50
TOTAL    304                                    281                                    585


Table 3 - Analyzed tokens per speaker; percentage of coronal rhotics and major allophone in
each language.


3.2 Auditory analysis 

Table 3 above contains data on the auditory analysis we ran and reports on 
the number of tokens, the percentage of coronal rhotics and the most frequent 
allophone for each speaker in the two languages. 
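
A summary of this kind can be derived mechanically from the coded tokens. The sketch below is illustrative only: the tuples are invented, the category labels follow the seven coding categories listed in Section 2.2, and `summarise` is a hypothetical helper.

```python
from collections import Counter, defaultdict


def summarise(tokens):
    """tokens: iterable of (speaker, language, category) tuples.

    Returns {(speaker, language): (n_tokens, percent_coronal, most_frequent_category)}.
    Deletions count as tokens but not as coronals.
    """
    groups = defaultdict(list)
    for speaker, language, category in tokens:
        groups[(speaker, language)].append(category)

    summary = {}
    for key, categories in groups.items():
        n = len(categories)
        coronal = sum(c.startswith("coronal") for c in categories)
        major = Counter(categories).most_common(1)[0][0]
        summary[key] = (n, round(100 * coronal / n), major)
    return summary


# Invented tokens for a hypothetical speaker "X1".
tokens = [
    ("X1", "Italian", "coronal tap"),
    ("X1", "Italian", "coronal tap"),
    ("X1", "Italian", "coronal trill"),
    ("X1", "Tyrolean", "dorsal fricative"),
    ("X1", "Tyrolean", "dorsal fricative"),
    ("X1", "Tyrolean", "dorsal approximant"),
    ("X1", "Tyrolean", "deletion"),
]
result = summarise(tokens)
print(result[("X1", "Italian")])   # (3, 100, 'coronal tap')
print(result[("X1", "Tyrolean")])  # (4, 0, 'dorsal fricative')
```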

It was evident from our analysis that all speakers but LSB1 resorted to a uvular 
consonant (mostly [y]) to read the Tyrolean words. As far as Italian words were 
concerned, however, both uvular and apical rhotics were attested, since SB4 and 
SB5 switched between the two places of articulation according to the language 
the prompts belonged to. 

It also emerged from the auditory analysis that none of the speakers we 
considered alternated between coronal and dorsal variants within the same 
language, and that in Tyrolean no other allophone besides [y, & x] was used,
while in Italian also [r] occurred. 


3.3 Articulatory analysis 

3.3.1 Intraspeaker comparison 

In order to assess whether adult bilinguals display one or two patterns of articulation
for rhotics in Italian and Tyrolean respectively, we first considered the static
articulations of the two very late sequential bilinguals LSB1 and LSB2, namely
an almost monolingual speaker of Italian and an almost monolingual speaker
of Tyrolean, and ran an intraspeaker comparison of their tongue profiles. Our
analysis was based on impressionistic observations of the shape and position of
the tongue, as well as on the statistical comparison of tongue splines.

The impressionistic, graphic analysis of LSB1's data reported in Fig. 1 shows that
in each of the nine CRV sequences we considered ([k, g, t, d, p, b | r | u, a, i])
there is no strong categorical distinction between tongue shape and position in
the two languages and that the two splines almost always coincide.


[Figure 1: 3 × 3 grid of panels; columns [u], [a], [i]; rows [k, g], [t, d], [p, b].]

Figure 1 - Tongue shapes for r-sounds in LSB1. See Fig. 2 for the explanation of colors.


[Figure 2a: curves labelled Palate, Tyrolean, Italian. Figure 2b: curves labelled Zero line, Difference line.]

Figure 2a - LSB1, mean tongue shapes for r-sounds.
Figure 2b - LSB1, radar chart of the t-test.


The green line at the top of the image always represents the palate, whereas the blue and
the red lines at the bottom represent the shape and position assumed by the tongue in Italian
and in the Tyrolean dialect respectively; tongue tip and blade are on the right, tongue root is
on the left. As a mere means of orientation in the radar chart, groups of spokes can stand for
the places of articulation in reference to the upper surface of the vocal tract. In a clockwise
direction they are approximately: spokes 7 to 13 alveolar ridge; 14-20 hard palate; 21-25 soft
palate (velum); 26-30 uvula; 31-35 pharynx.

Fig. 2a depicts the averaged spline calculated from the subset of splines
associated with a rhotic sound in each of the two languages and shows that the
main body of the tongue is held convex to the palate, with the antero-dorsum
rising straight and steeply and the tip down, pointing to the alveolar ridge on
the roof of the mouth, thus defining a constriction in the post-alveolar area and
almost always producing an alveolar tap in both languages, as attested by the
auditory analysis.
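
Averaging splines is straightforward when every spline is sampled on the same set of spokes. The sketch below assumes that each spline is stored as a list of radial distances, one per spoke, with `None` where the tongue surface could not be traced; this data layout and the `mean_spline` helper are assumptions made for illustration and do not reproduce AAA's internal format.

```python
def mean_spline(splines):
    """Average several splines spoke by spoke, ignoring untraced (None) points.

    Returns one value per spoke, or None where no spline has data.
    """
    if not splines:
        return []
    n_spokes = len(splines[0])
    if any(len(s) != n_spokes for s in splines):
        raise ValueError("all splines must use the same number of spokes")

    means = []
    for spoke in range(n_spokes):
        values = [s[spoke] for s in splines if s[spoke] is not None]
        means.append(sum(values) / len(values) if values else None)
    return means


# Three invented splines over four spokes (arbitrary units).
splines = [
    [None, 10.0, 12.0, 9.0],
    [None, 11.0, 14.0, None],
    [None, 12.0, 13.0, 11.0],
]
print(mean_spline(splines))  # [None, 11.0, 13.0, 10.0]
```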

The initial impression of similarity between the two tongue profiles is confirmed
by the statistical analysis, which is based on the calculation of a t-test¹² for each
spoke between the two splines via the AAA integrated tool and is rendered here
in a radar chart where the greater the distance between the two lines, the greater
the difference between the two splines (Fig. 2b).
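
The per-spoke comparison can be sketched as follows. The code implements a standard two-tailed Welch t-test (unequal variances, Welch-Satterthwaite degrees of freedom) using only the Python standard library, obtaining the p-value by numerically integrating the Student t density. It is not AAA's implementation; the data layout (one list of radial distances per spoke and language) and all function names are assumptions.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson integration of the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_half)


def welch_t_test(a, b):
    """Return (t, df, p); needs at least two values per sample and some variance."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_tailed_p(t, df)


def significant_spokes(lang_a, lang_b, alpha=0.05):
    """lang_a, lang_b: dicts mapping spoke number -> list of radial distances."""
    return [spoke for spoke in sorted(lang_a.keys() & lang_b.keys())
            if welch_t_test(lang_a[spoke], lang_b[spoke])[2] < alpha]


# Invented measurements for two spokes (arbitrary units).
italian = {10: [50.0, 51.0, 49.0, 50.5], 11: [40.0, 41.0, 39.0, 40.0]}
tyrolean = {10: [50.2, 50.8, 49.5, 50.1], 11: [45.0, 46.0, 44.0, 45.5]}
print(significant_spokes(italian, tyrolean))  # [11]
```

Counting the spokes returned by such a function gives the number of significantly different points between two mean splines.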


[Figure 3a: curves labelled Palate, Tyrolean, Italian.]

Figure 3a - LSB2, mean tongue shapes.
Figure 3b - LSB2, radar chart of the t-test.


The analysis of LSB2's data offers a different image for the tongue shape and
position, but a very similar one as regards the near coincidence of the profiles in
Tyrolean and in Italian. Extracted mean tongue surfaces (Fig. 3a) show a near
semi-circular shape, especially for Italian, with a retracted root, the dorsum held
convex to the palate and the lamina pointing down. The tongue bunching up
towards the postvelar zone and the absence of an alveolar constriction point
to a dorsal articulation, which fits in with the acoustic analysis that shows a
predominance of voiced or voiceless uvular fricatives. The statistical analysis
(Fig. 3b) of the difference between the two splines shows that the differences cluster
in the laminal and in the posterodorsal area, apparently because of a slight
backwards shifting of the tongue which is still visible notwithstanding the
poor quality of the images in the hindermost region of the tongue.

The intraspeaker comparison of SB1 again shows an almost complete overlap of the two
contours (Fig. 4a), which display a near semicircular shape similar to that reported for
LSB2: the tongue is mid bunched and the lamina is kept low while the middle of the
tongue is raised towards the hard palate. This configuration allows the identification
of a dorsal articulation, notwithstanding the limited size of the depicted palate,
which makes it difficult to assess the place of articulation precisely. Nevertheless the
auditory analysis of this speaker's production by the two evaluators converges on
an auditorily identical [4] as the most recurrent variant, which is further confirmed
by the spectrographic analysis. As for the similarity between the two profiles, the
t-test (Fig. 4b) shows that the difference between the two splines almost equals zero,
except for two points in the foremost part of the imaged tongue¹³ and for a point
in the back due to a higher degree of root retraction in Tyrolean.


12 2-tailed t-test, unequal variances and sample sizes, Welch-Satterthwaite equation as performed by AAA.
The t-test was significant at 5%.


[Figure 4a: curves labelled Palate, Tyrolean, Italian. Figure 4b: curves labelled Zero line, Difference line.]

Figure 4a - SB1, mean tongue shapes.
Figure 4b - SB1, radar chart of the t-test.


For speaker SB2 (Fig. 5a) the tongue is held convex to the palate with the
anterodorsum rising and the tongue tip down, pointing to the alveolar ridge,
thus defining a dorsal articulation. The impressionistic and the statistical (Fig.
5b) analyses of the difference between the two splines show that even if the
two profiles are broadly comparable in shape, in Tyrolean the tongue tends to
be lower than in Italian, especially in the dorsum. However, the radar chart also
shows that statistically significant differences emerge in the antero-dorsum
rather than in the root.


[Figure 5a: curves labelled Palate, Italian, Tyrolean. Figure 5b: curves labelled Zero line, Difference line.]

Figure 5a - SB2, mean tongue shapes.
Figure 5b - SB2, radar chart of the t-test.


13 Even if statistically significant, this result is not revealing given the poor definition of the tongue
profile at the considered point.

The intraspeaker comparison of SB3 profiles (Fig. 6a) again shows two broadly
comparable contours similar to those of LSB2, with a clear mid bunching of the
tongue: the front, blade and tip are low, while the middle of the tongue is raised
towards the palate to articulate uvular approximants that are similar even spectrographically.
The Cartesian space shows that in Italian the tongue is kept lower except for the
foremost portion, which is higher. Nevertheless, the radar chart associated with the
t-test (Fig. 6b) illustrates that the difference between the two splines is significant
except in the anterodorsal and radical portions. The differentiation thus seems to
involve the position of the tongue, rather than its shape.


[Figure 6a: curves labelled Palate, Tyrolean, Italian. Figure 6b: curve labelled Difference line.]

Figure 6a - SB3, mean tongue shapes.
Figure 6b - SB3, radar chart of the t-test.


For speaker SB4, Figure 7a shows that both in Tyrolean and in Italian the
tongue is kept smoothly convex to the palate, with no bunching or tip raising.
Even if the profiles are similar in shape, the intraspeaker comparison via the
t-test reports a significant differentiation which affects almost every point and,
again, is due to the different position the tongue takes with respect to the palate,
lowered and retracted in Tyrolean. Surprisingly, both the auditory and the
spectrographic analyses give different outcomes for the two languages: dorso-uvulars
prevail in Tyrolean, while alveo-coronals are predominant in Italian.


[Figure 7a: curves labelled Palate, Tyrolean, Italian. Figure 7b: curves labelled Zero line, Difference line.]

Figure 7a - SB4, mean tongue shapes.
Figure 7b - SB4, radar chart of the t-test.


Examination of tongue curves for speaker SB5 shows (Fig. 8a) that she
articulates rhotics in the two languages in different ways: in Tyrolean her tongue
forms a smooth convex curve with no distinct bunching; the root is slightly
retracted, the body leans towards the back of the mouth and the tip is far from
forming a point of primary constriction next to the alveolar ridge. By
contrast, when she articulates a rhotic in Italian, the body of the tongue is more
advanced and presents a mid-bunching; the middle is more raised towards the
hard palate while the blade and the tip are kept high, at least higher than in
Tyrolean. In addition, a saddle can be seen, which probably coincides with the
place where the dorsum and the lamina diverge.

The visual impression of a difference between the two mean splines for the
two languages is further confirmed by the statistical and auditory analyses: as
reported in the radar chart (Fig. 8b), there are significant differences both in
the posterodorsal/radical region and in the laminal area; and, as shown by
the auditory analysis, the speaker uses apical rhotics ([r]) in Italian and
uvular rhotics ([y]) in Tyrolean.


[Figure 8a: curves labelled Palate, Italian, Tyrolean.]

Figure 8a - SB5, mean tongue shapes.
Figure 8b - SB5, radar chart of the t-test.


Data presented so far allow us to answer the first question on whether adult
bilinguals display different patterns of articulation for rhotics in the two
languages they speak, and to state that apparently no space for differentiation
is left for very late bilinguals. In fact, they tend to transfer almost completely
the shape and position of articulation from one language to the other and to
articulate rhotics in the second language they learnt as if they were instances of
the first language they learnt.

On the other hand, simultaneous bilinguals tend to differentiate between
articulation patterns in the two languages, albeit to varying degrees: indeed,
as reported in Table 4, while in the case of SB1 the two splines significantly
differ in only two points, for the rest of the informants the number of points
increases up to more than half of the traceable profile, as in the cases of
SB3 and SB4.


              SB1   SB2   SB3   SB4   SB5
N. OF POINTS  2     7     17    17    14


Table 4 - Number of significantly different points between the two splines.


As already mentioned, intraspeaker differences in tongue splines might
reflect a change in the position of the tongue or modifications in its shape.
Changes in the position of portions of the tongue seem to affect
SB1, SB2, SB3 and SB4 especially, and to ensue from the placement of the post-dorsum,
which in Tyrolean tends to be moved towards the uvula and the pharynx.
Minor changes in position also affect the lamina, which in Italian (in all but
one case, SB4) is shifted upwards, which is sometimes unexpected, as
when uvular rhotics are produced.

Changes in the shape of the tongue are rarer if considered from the intraspeaker
comparison perspective, and are actually limited to SB5, who in Italian
keeps the antero-dorsum and the lamina high towards the hard palate
and the alveoli. This modification is in keeping with the different acoustic
outputs in the two languages (coronal and dorsal), but counter-intuitively is
not found in SB4 despite a similar front-back alternation in her auditory
productions.

These results on intraspeaker changes in tongue position and shape are
relevant to the phonetic characterization of simultaneous bilingual speakers
because they point to possible space for differentiation in the articulation of
rhotics in the two languages notwithstanding the absence of overt auditory
differentiation for the two languages and, de facto, the transfer of a phone
from one system to the other. This is of importance because it shows how
articulatory data can add to the study of acoustically based theories of bilingual
phonology, introducing previously unattested considerations such as auditory
invariance coupled with articulatory differentiation¹⁴. It also allows modeling
of the effects of language contact within adult simultaneous bilinguals, who as
individual speech producers may serve as precursors for language change.


14 Please refer to Vietti (2012) for an account of acoustic invariance coupled with articulatory
differentiation in the uvular fricatives of a simultaneous Italian/Tyrolean bilingual.


3.3.2 Interspeaker comparison

In order to address the second question as to whether the patterns of articulation 
of adult bilinguals resemble those of monolinguals, we ran an interspeaker 
comparison between the simultaneous bilinguals SB1-5 and the very late
sequential bilingual speakers LSB1 and LSB2, who acted as control subjects:
indeed in a region characterized by societal multilingualism such as South 
Tyrol, it is almost impossible to find truly monolingual speakers. 

Our comparison is impressionistic and based on the superimposition of the 
different speakers’ palates based on translations and rotations (but not on 
rescalings) aimed at identifying the points of maximum coincidence in the areas 
of the alveolar ridge and the hard palate as shown in Fig. 9. 
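
For readers who prefer a quantitative counterpart to this impressionistic superimposition, a rigid (translation plus rotation, no rescaling) least-squares alignment of two palate traces can be sketched as below. This is a swapped-in standard technique (2D Procrustes alignment without scaling), not the procedure used in the study; it further assumes that both palates are given as equally long lists of corresponding (x, y) points, and all names are hypothetical.

```python
import math


def rigid_align(source, target):
    """Rotate and translate `source` onto `target` (no rescaling).

    Both arguments are equally long lists of corresponding (x, y) points.
    Returns (aligned_points, angle_in_radians).
    """
    if len(source) != len(target) or not source:
        raise ValueError("need two non-empty point lists of equal length")
    n = len(source)
    sx = sum(p[0] for p in source) / n
    sy = sum(p[1] for p in source) / n
    tx = sum(p[0] for p in target) / n
    ty = sum(p[1] for p in target) / n

    # Optimal rotation angle from the centred point sets (closed form in 2D).
    dot = cross = 0.0
    for (x, y), (u, v) in zip(source, target):
        x, y, u, v = x - sx, y - sy, u - tx, v - ty
        dot += x * u + y * v
        cross += x * v - y * u
    angle = math.atan2(cross, dot)

    c, s = math.cos(angle), math.sin(angle)
    aligned = [(c * (x - sx) - s * (y - sy) + tx,
                s * (x - sx) + c * (y - sy) + ty) for x, y in source]
    return aligned, angle


# Invented palate trace, and a copy rotated by 30 degrees and shifted.
palate_a = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.8), (3.0, 0.6)]
theta = math.radians(30)
palate_b = [(math.cos(theta) * x - math.sin(theta) * y + 5.0,
             math.sin(theta) * x + math.cos(theta) * y - 2.0) for x, y in palate_a]
aligned, angle = rigid_align(palate_a, palate_b)
print(round(math.degrees(angle), 3))  # 30.0
```

In practice the fit would be restricted to points on the alveolar ridge and hard palate, and the same rotation and translation would then be applied to the tongue splines of the speaker being moved.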


[Figure 9: superimposed palates labelled Palate LSB1 and Palate LSB2.]

Figure 9 - Interspeaker comparison: LSB1 (blue), LSB2 (red) mean tongue shapes.


Fig. 9 depicts the static articulation for mean rhotics in the two very late
bilinguals LSB1 (the Italian dominant, in blue) and LSB2 (the Tyrolean
dominant, in red). This qualitative analysis clearly illustrates that sequential
bilinguals use two radically different tongue configurations and allows us to
spot the two different places of articulation, the coronal (alveolar) and the dorsal
(uvular), which is not at all unexpected given that according to previous research
(see also Romano 2013) coronal articulations are quasi-standard in Italian while
uvular articulations are quasi-standard in the Tyrolean dialect.

In order to answer our second research question, the comparison between 
LSB1, LSB2 on the one hand and SB4 and SB5 on the other is, however, of 
higher relevance than that of LSB1 and LSB2 or that of SB1-SB3 because the 
two simultaneous bilinguals SB4 and SB5 are the only speakers to modify, in 
an auditorily perceptible manner, the place of articulation of rhotics in the two 
languages.


Figure 10a - LSB1, SB4, SB5 Italian.
Figure 10b - LSB2, SB4, SB5 Tyrolean.


Figures 10a and 10b report the graphical comparisons of tongue profiles in 
Italian and Tyrolean respectively for LSB1, LSB2, SB4, SB5, and show that 
for both languages the mean tongue profiles of the simultaneous bilinguals
diverge from the averaged profiles of the Italian dominant and of the Tyrolean 
dominant sequential bilinguals. 

As regards Italian, the tongue profile of the simultaneous bilinguals SB4 and 
SB5 differs from that of LSB1. In the case of SB4 there is no steep raising of 
the postero-dorsum but a higher rate of root retraction and a moderate lowering 
of the lamina instead. On the other hand SB5 displays a higher rate of root 
retraction and a significant lowering of the middorsum. 

As regards Tyrolean, the picture is similar and tongue profiles for SB4 and SB5 
differ from that of LSB2. Indeed even if SB4's tongue shape is similar to that of 
LSB2 and even if posterodorsum and root almost coincide, the anterodorsum 
is kept significantly lower by the simultaneous bilingual. SB5, on the other
hand, converges towards the root retraction that is also typical of
the Tyrolean-dominant speaker, but still shows a significant lowering of the
middorsum.

The interspeaker comparison of tongue profiles thus shows that the patterns of
articulation for rhotics used by simultaneous bilinguals differ from those used
by almost monolingual speakers. This result is relevant because it shows
that simultaneous bilinguals might differ from very late sequential bilinguals
in the articulatory implementation of the same rhotic phonetic segments, not only
in the sense that, at least articulatorily, they maintain cross-language phonetic
differences, but also in that they develop new, third articulatory patterns that
diverge from those of native speakers.




4. Discussion 


The collected data, and especially the intraspeaker comparison, show that very
late sequential bilinguals do not present distinct articulatory patterns for rhotics
in the two languages, while the simultaneous bilinguals do, even if to varying
degrees. Besides, the interspeaker comparison shows that the articulatory patterns
for rhotics used by simultaneous bilinguals differ from those used by the very
late sequential bilingual speakers who acted as control subjects. Differentiation
of patterns might occur as a consequence of articulatory, acquisitional or
sociophonetic factors.

In articulatory terms, marked intraspeaker differentiation as exploited by
simultaneous bilinguals SB4 and SB5 is used effectively to reach different
articulatory targets in the two languages and make the speaker sound like a native
monolingual in each of the two codes. Marked intraspeaker differentiation of this
kind, however, seems to be counter-economical: rhotics are known not only for
their interchangeability (the coronal/dorsal opposition is non-pathological
in both Italian and Tyrolean), but also for the complex constellations of gestures
that are required to articulate them (Proctor 2009). This might be the reason for
developing third articulatory patterns that apparently allow for an economical reuse
of most of the articulatory program, except for fine tunings of tongue root and tip
positions, which seem comparable to those attested in speakers SB1 and SB3. Indeed
these speakers, who resort to an at least auditorily identical [s] in both languages,
build the auditorily undetectable⁵ but articulatorily visible opposition between
rhotics in the two languages on just one parameter, namely a change in the tongue
position, and specifically raising vs. lowering or advancing vs. retracting of the
whole tongue in Italian and Tyrolean respectively.

From the acquisitional perspective, intraspeaker differentiation of patterns
as reported for simultaneous bilinguals could occur as a consequence of the
particular organization of bilingual speakers' phonetic system. In this sense a
proposal such as the one put forward by Flege (1995) on the basis of perceptual
and acoustic data in the Speech Learning Model (SLM) is of interest, even if it
only partially suits our records. The transfer of articulatory patterns from the
first to the second learnt language attested for LSB1 and LSB2 is compatible
with the mechanism of phonetic category assimilation that according to the
SLM should affect speakers with limited exposure to the second language (both
in qualitative and quantitative terms, namely Age of Arrival and especially
Length of Residence). However, the elaboration of third, merged patterns of
articulation that apparently draw on L1 and L2 input should not be a
characteristic of speakers exposed to the two languages for a long time. On the
contrary, those speakers should rather operate a phonetic category dissimilation
so as to increase the phonetic difference between the realizations in the two
languages.

5 To further prove this statement, a broader auditory analysis and/or a rigorous perceptual study is necessary.

The SLM probably fails to account for data such as those presented here for
three reasons. First, the theory was not elaborated to explain articulatory data.
Second, rhotics are special with respect to their perceptibility: see, for example,
the research by Engstrand et al. (2007) on the perceptual bridge in rhotics, which
showed how coronal and dorsal rhotics may occasionally be confused in perception
so that "intended coronals could be interpreted as dorsals or viceversa"
(2007:176). Third, and most importantly, the data compared here refer to
simultaneous and not to (very late) sequential bilinguals.

Moreover, our data pertain to simultaneous bilinguals raised in a situation of
societal bilingualism. As this difference is of sociophonetic relevance, it should
not be disregarded; indeed it should be stressed here that attitudinal factors
might also play a role. In particular, the decision of simultaneous bilinguals to
characterize themselves as members of one of the two established linguistic
communities or as members of a truly bilingual community might favor
the use of two separate patterns of articulation (as in SB4 and SB5) or the
development of a third system of articulation (as for SB1, SB2 and SB3) to
index their respective identities. In this sense rhotics would prove once more
to be the preferred markers of local identity and/or of social variation selection.


5. Conclusion 


This study aimed to add new data and details to previous work on the phonetics
of rhotics in Italian and Tyrolean, and showed how variable this class of sounds 
proves to be if considered from an articulatory perspective. In addition, it aimed 
to offer new data for the study of the phonological systems of bilingual speakers 
and showed how previous proposals such as SLM can be put to the test simply 
through the adoption of articulatory data. 

However, the authors of this paper are well aware that the results are preliminary, 
and therefore not conclusive. First of all, there were limitations in the size of the 
dataset used to derive their observations, and especially the representativeness 
of those observations. Secondly, the image quality was sometimes poor, and 
in particular the image resolution was poor enough to sometimes distort the 
derived representation of the tongue shape. Lastly, the authors recognize the 
limitations of the impressionistic technique used to evaluate the data, especially 
in comparison to quantitative analysis as permitted by techniques such as
SS-ANOVA (Davidson 2006) or the nearest neighbor distance (Zharkova &
Hewlett 2009).

This, together with interspeaker normalization, will be addressed in future research.

Appendix 1
Italian target words 


privo, prato, prude, triste, trave, truce, cricca, crampo, crudo, briga, bravo, bruco, dritto,
drago, druso, grave, grido, gruppo.


Tyrolean target words 


prigl, pratzl, prunzen, trichtor, traktor, truhe, krischtn, kravall, krustn, brikett, brathiandl, 
bruscht, driber, dran di, druckn, grint, graf, gruslig. 

Abstract

The first of two studies in this paper (both using electromagnetic articulography)
focused on onset clusters in German and French. Less overlap of C1 and C2 was found
in plosive-nasal and plosive-rhotic clusters compared to plosive-laterals. Articulatory
modeling was used to identify why the preferred coordination patterns are acoustically
advantageous, and implications for metathesis and other diachronic processes are
discussed. The second study analyzed the syllabic consonants /l/ and /r/ in Slovak.
These consonants did not become kinematically more ‘vocalic’ in nuclear compared to
marginal position. However, nuclear consonants preferred low-overlap coordination
with the preceding consonant, compared to onset clusters and to vocalic syllables. We
suggest that a low overlap setting favours the emergence of syllabic consonants.


1. Introduction 


We consider two areas in which rhotics have proved fruitful for arriving at a 
better understanding of principles of articulatory coordination in consonant 
sequences. Both studies are based on recently-acquired articulatory (EMA) 
data. For the first area we look at onset clusters consisting of plosive plus lateral, 
nasal, or rhotic in German and French. The overall goal is to understand why 
these obstruent-sonorant clusters differ synchronically in their frequency of 
occurrence across languages (and differ in diachronic stability). The second area
focuses on syllabic consonants (lateral and rhotic) in Slovak, seeking, in a rather 
similar vein, to understand why syllabic consonants are typologically rare, and, 
concomitantly, what factors may favour their emergence when they do occupy a 
prominent position in the sound structure of a language, as is the case in Slovak. 
Since for the syllabic consonants we focus in particular on their coordination 
with adjacent consonants there are substantial superficial similarities in the 
kinds of sound sequences examined in both parts of the paper. And both parts
are united at a less superficial level in that they aim for a better understanding 
of general principles of coordinating consonant with consonant and consonant 
with vowel, and how these principles are affected by position in the syllable and 
the segmental make-up of the sound sequences involved (for more background 
to our overall approach see e.g. Pouplier 2012). 


2. Obstruent-sonorant clusters in German and French 


In this section we first review earlier work in which we compared clusters such as 
/kl/ and /kn/, and then move on to more recent analysis of plosive-rhotic clusters. 


2.1 Plosive plus lateral and nasal 

The earlier findings (e.g. Hoole et al. 2009; Bombien et al. 2010; Bombien et 
al. submitted) revealed a consistent pattern of less articulatory overlap between 
C1 and C2 in German clusters such as /kn/ compared to /kl/. 


[Figure 1: three panels (audio, tongue-tip, tongue-back); x-axis: Time (s)]

Overlap (normalized %) = ((Offset_2 - Onset_4) / (Offset_4 - Onset_2)) * 100

More positive values indicate more overlap of Phase 2 and Phase 4


Figure 1 — Illustration of measurement of articulatory overlap using EMA data. Top panel: 
audio; middle panel: vertical component of tongue-tip movement; bottom panel: vertical 
component of tongue-back movement. Phases 1 and 3 extend from onset of movement 
towards consonant target up to attainment of target position; both time points are based 
on a 20% velocity criterion. Phases 2 and 4 delimit the target plateau region. In the formula 
for overlap calculation 'Offset 2' refers to the time point of the right boundary of Phase 2 
(analogously for other time points). 
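
The 20% velocity criterion mentioned in the caption can be illustrated with a minimal sketch. The text does not spell out the exact procedure, so the following is only one plausible reading, under the assumption that a movement is delimited by the samples at which speed stays at or above 20% of its peak. The function name `movement_landmarks` and the sample values are hypothetical.

```python
from typing import List, Tuple


def movement_landmarks(velocity: List[float], threshold: float = 0.2) -> Tuple[int, int]:
    """Return (onset_index, target_index) for a single movement.

    velocity  : sampled velocity of one sensor component during one movement
    threshold : fraction of peak speed used as criterion (0.2 = 20%)
    Assumption: the movement spans the contiguous samples around the speed
    peak whose speed is at or above threshold * peak speed.
    """
    if not velocity:
        raise ValueError("velocity signal is empty")
    speeds = [abs(v) for v in velocity]
    peak_index = max(range(len(speeds)), key=speeds.__getitem__)
    limit = threshold * speeds[peak_index]

    # Walk backwards from the peak while speed stays at or above the limit.
    onset = peak_index
    while onset > 0 and speeds[onset - 1] >= limit:
        onset -= 1

    # Walk forwards from the peak while speed stays at or above the limit.
    target = peak_index
    while target < len(speeds) - 1 and speeds[target + 1] >= limit:
        target += 1

    return onset, target


# Hypothetical speed profile (arbitrary units), not measured data.
profile = [0.0, 1.0, 3.0, 8.0, 10.0, 7.0, 2.0, 1.0, 0.0]
onset_idx, target_idx = movement_landmarks(profile)
```

With real EMA data the indices would be converted to time using the sampling rate, and the end of one movement would mark the start of the following plateau.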

Fig. 1 illustrates how these measurements were carried out using EMA sensors
(Carstens AG500 articulograph) located on tongue-tip (indexing constriction
for /l/ and /n/) and tongue-dorsum (indexing constriction for /k/).

Various overlap measures have been suggested in the literature. The one 
used here is based on onset and offset of the target plateau regions for C1 
and C2. If (Offset 2 - Onset 4) in the formula is positive this indicates 
that the articulatorily defined constriction for C2 has been reached before 
the constriction for C1 has been released. If this difference value is negative 
(actually the normal case in our data) this indicates a lag between the end of 
C1 constriction and the beginning of C2 constriction. To account to some
extent for differences in speech rate between utterances and subjects, the values
are normalized by the total duration of the constriction phases (i.e. by the 
difference in time between the offset of C2 and the onset of C1). 
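
The overlap measure just described can be written out as a short sketch. It follows the formula given with Fig. 1; the class name `Plateau`, the function name `normalized_overlap` and the time values are illustrative assumptions, not part of the original analysis scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plateau:
    """Target plateau of one consonant: onset and offset times in seconds."""
    onset: float
    offset: float


def normalized_overlap(c1: Plateau, c2: Plateau) -> float:
    """Percent overlap of the C1 plateau (Phase 2) and the C2 plateau (Phase 4).

    Numerator: offset of C1 plateau minus onset of C2 plateau.
    Denominator: offset of C2 plateau minus onset of C1 plateau,
    i.e. the total duration of the constriction phases.
    """
    total = c2.offset - c1.onset
    if total <= 0:
        raise ValueError("C2 plateau must end after the C1 plateau begins")
    return (c1.offset - c2.onset) / total * 100.0


# Hypothetical time points (seconds), not taken from the recordings.
lagged = normalized_overlap(Plateau(1.50, 1.55), Plateau(1.60, 1.70))
overlapped = normalized_overlap(Plateau(1.50, 1.60), Plateau(1.55, 1.70))
print(round(lagged, 1), round(overlapped, 1))
```

In the first hypothetical case C2 reaches its target only after C1 has been released, so the value is negative (a lag); in the second the C2 target is reached before the C1 release, so the value is positive.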

The finding of a low degree of overlap in /kn/-clusters is probably to be 
explained by the fact that premature lowering of the soft palate for /n/ would 
destroy the acoustic characteristics of the /k/-burst. This interpretation has 
been confirmed by modelling work using TADA (Nam et al. 2009). TADA is 
an articulatory synthesis application based on task dynamics and the coupled 
oscillator model of syllable structure. It allows gestural parameters to be 
systematically modified, with the final synthesis being performed by generating 
control parameters for the pseudo-articulatory synthesizer HLSyn (Hanson 
& Stevens 2002). The connection with HLSyn is particularly interesting in 
this case, because HLSyn also synthesizes the pressures and flows in the 
vocal tract resulting from the articulatory input specification. Accordingly, it 
was possible to observe that when a plosive-nasal cluster is synthesized with 
TADA's default coordination relations then the intraoral pressure declined 
prematurely during the plosive because of nasal leakage, resulting in an 
absence of a burst at the articulatory release of the plosive. By using a low- 
overlap coordination topology originally suggested by Goldstein et al. (2009) 
to capture the difference between so-called homogeneous (high overlap) and 
non-homogeneous (low-overlap) clusters in Georgian the air-pressure trace 
became more typical of a plosive. However, for a completely satisfactory result 
it was also necessary to adjust the gestural parameters of the velar gesture itself 
(rather than just globally adjusting the overlap between C1 and C2) to ensure 
that it made a sharp transition from closed during oral closure for the plosive, 
to open for the following nasal. Thus it seems that plosive-/n/ clusters may be
plausibly regarded as physiologically costly. 

Fig. 2 compares the air-pressure traces resulting from the default and the 
‘tuned’ coordination parameters when synthesizing a [pn] consonant sequence 
(preceded and followed by a vowel). The key points to notice are that in the
curve labeled ‘untuned’ the peak air pressure is not quite as high, and also does 
not maintain a plateau after reaching its maximum at about 100 ms on the time 
axis. Since labial closure is not released until about 150 ms this indicates nasal 
leakage of air in the untuned case. 

Figure 2 - Intraoral air pressure in synthesized /pn/ sequence comparing standard coordination
parameters (‘untuned’) with adjusted ones (‘tuned’).


Overall, these results fit well into our guiding hypothesis that ‘successful’ 
clusters (/kl/ is clearly diachronically more stable than /kn/) are those that offer 
a good compromise between parallel transmission of segmental information 
(efficient production) and adequate recoverability in perception (cf. Chitoran 
et al. 2002), i.e. key acoustic features of /kl/ would be maintained even at 
high overlap, whereas /kn/ would suffer from reduced perceptibility due to 
impairment of burst characteristics from nasal air leakage. 


2.2 Plosive-rhotic clusters 

More recently, we have compared the onset clusters /pl, bl/ with /pr, br/ (and
also /fl/ with /fr/) for five speakers of French and four of German¹. The basic
procedure remains as in Fig. 1, except that now a sensor on the lower lip is
used to analyze articulatory activity for C1, and, since all speakers produced
a dorsal /r/, the tongue-back sensor was used to analyze C2 in the /r/-clusters.

1 We will be using /r/ as a phonemic symbol to indicate what was in fact a dorsal articulation in the uvular
region: approximant or voiced fricative following voiced C1, usually voiceless fricative following voiceless
C1. A fifth German speaker who produced an apical variant was left out of consideration here. Obviously, a
systematic comparison of apical and dorsal rhotics would be an interesting task for future work.

The plosive-rhotic clusters are of interest precisely because it is not immediately
clear what overlap pattern to expect. On the one hand, based on sources such
as Vennemann (2000, 2012) there seems no reason to assume that /r/-clusters
are disfavoured compared to /l/-clusters (if anything, the reverse). On the other
hand, there are well-documented cases of instability involving r-clusters, namely
metathesis such as the following²:


1. French, standard vs. dialectal: premier: /pʁœmje/ vs. /pɛʁmje/ (from Russell
Webb & Bradley, 2009);

2. Germanic: ross (Icelandic) vs. horse (rhotic English dialects); 

3. English, standard vs. dialectal: pretty vs. perty. 


In fact, there was a very consistent result of lower overlap in the /r/-clusters 
compared to the /l/-clusters (see Fig. 3). 


[Figure 3: two panels, "Percent overlap. German" and "Percent overlap. French"; x-axis: C1 (f, p, b); y-axis: overlap (%)]

Figure 3 - Articulatory overlap in onset clusters /fl, fr, pl, pr, bl, br/ (from left to right in each
panel). Overlap computed as illustrated in Fig. 1: for /l/-clusters overlap of lip and tongue-tip;
for /r/-clusters overlap of lip and dorsum. Averaged over 4 speakers of German and 5
speakers of French. More negative values indicate less overlap. Error bars indicate standard
error of mean over speakers.


The examples of metathesis just mentioned would emerge rather naturally from
low overlap between the consonants of the onset cluster, particularly if this were
accompanied in turn by a high degree of overlap between the rhotic and the
following vowel. This would indeed be a prediction of the c-center principle of
coordination identified by Goldstein et al. (2009): as the number of elements
in the onset increases, the left edge of the onset moves to the left and the right
edge to the right, leaving the center of the onset in the same position relative to
an anchor point in the vowel, regardless of the number of elements in the onset.
Extending this to the present case, a complex onset with a low degree of overlap
should show a particularly pronounced right shift of the rightmost consonant
over the vowel compared to a control simple onset condition³.

2 A reviewer suggested that these examples might be better captured by vowel insertion before the rhotic
followed by deletion of the original post-rhotic vowel, rather than metathesis in the traditional sense. Such a
scenario would fit in equally well with the patterns of gestural shift and the link between gestural overlap
and vowel epenthesis that we discuss below. The label attached to these examples is less crucial than the
basic point that the seeds of change and instability may be found in specific coordination relations.

Currently, only a rather small subsection of our corpora is suitable for testing 
this prediction because we require target items that contrast simple and complex 
onsets but have the same nucleus vowel and coda (in practice the formation of 
consonantal closure for the coda usually provides a kinematically better-defined 
anchor-point than a time-point in the vowel). Thus the test is not as rigorous as 
we would like. Fig. 4 shows the results for the items that we were able to select 
from the German and French corpora, namely tat vs. trat for German and bac 
vs. braque for French. 
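
The comparison of simple and complex onsets relative to a common anchor can be sketched as follows. This is only an illustration under stated assumptions: the c-center is taken here as the mean of the plateau midpoints of the onset consonants (one common operationalization, not necessarily the one used for Fig. 4), and the function names and all time values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Tuple

Plateau = Tuple[float, float]  # (onset, offset) of a consonant plateau, in ms


def onset_landmarks(consonants: List[Plateau], anchor: float) -> Dict[str, float]:
    """Left edge, c-center and right edge of a syllable onset as lags to the anchor.

    Negative values mean that the landmark precedes the anchor point.
    Assumption: c-center = mean of the plateau midpoints of all onset consonants.
    """
    if not consonants:
        raise ValueError("an onset needs at least one consonant")
    midpoints = [(start + end) / 2 for start, end in consonants]
    return {
        "left_edge": consonants[0][0] - anchor,
        "c_center": mean(midpoints) - anchor,
        "right_edge": consonants[-1][1] - anchor,
    }


def landmark_shift(simple: Dict[str, float], cluster: Dict[str, float]) -> Dict[str, float]:
    """Shift of each landmark from the simple to the complex onset (ms)."""
    return {name: cluster[name] - simple[name] for name in simple}


# Hypothetical plateau times (ms) relative to a coda anchor at 0 ms.
simple_onset = onset_landmarks([(-260.0, -200.0)], anchor=0.0)
complex_onset = onset_landmarks([(-320.0, -260.0), (-200.0, -140.0)], anchor=0.0)
shifts = landmark_shift(simple_onset, complex_onset)
```

In this idealized case the left edge shifts leftwards and the right edge rightwards by the same amount while the c-center lag is unchanged, which is the pattern the c-center principle predicts; measured data can then be compared against it.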


[Figure 4: two panels, German /ta:t/ vs. /tra:t/ and French /bak/ vs. /brak/; x-axis: Time re. coda onset (ms)]

Figure 4 - Timing of syllable onsets with and without rhotic relative to common anchor point in
coda consonant, i.e. relative to achievement of /t/ constriction target for German (left) and of
/k/ constriction for French (right).


The German data basically show the pattern expected from the c-center
hypothesis: the right edge of the complex onset (i.e. the right edge of /r/ in
/tra:t/) is further to the right than the right edge of the simplex onset (/t/ in
/ta:t/). However, in marked contrast, for the French data the right shift of the
right edge is very weak, whereas the left edge of the onset (i.e. the left edge of the
/b/) shifts substantially to the left. This is not the pattern that would be expected
if we want to argue for a particular affinity between the rhotic and the vowel.
Given these mixed results, it would clearly be premature at this stage to claim
that this kind of metathesis is accounted for by a particularly strong propensity
of the rhotic to overlap the following vowel (with the potential for a categorical
diachronic shift in position). Nevertheless, it would clearly be interesting to
follow up this line of analysis with a new corpus that is purpose-designed
to provide appropriate anchor points. It is also worth noting here a further
prediction that emerges from the contention that metathesis is related to the
degree of consonantal overlap: based on the results shown in Fig. 3 it should
be less common in clusters with lateral than in clusters with rhotic⁴. Currently
we are not aware of any quantitative data from the sound-change literature that
allows this question to be answered⁵.

3 Russell Webb & Bradley (2009) take this line of thought even a step further by simply assuming in their
optimality theory account of metathesis that the centre of the rhotic is coordinated with the centre of the vowel.

Even though the articulatory patterns found in rhotic clusters were not 
necessarily the ones initially expected we have recently been able to use 
articulatory synthesis to gain some further insight into why speakers appear to 
avoid high overlap in these clusters. For this we used the VocalTractLab package
(Birkholz 2012; Birkholz et al. 2006). As a point of departure we used gestural 
timing parameters that would give a reasonable approximation to the German 
syllable onset /br/ as in brat. The overlap between the onset consonants was 
then increased by 50 ms. The most striking result was that the duration of 
voicelessness following release of the /b/ increased substantially (sounding 
perceptually more like /pr/ than /br/), even though no changes were made 
to the synthesis control parameters directly related to voicing. This indicates 
that a dorsal constriction in the velar or uvular region results in aerodynamic 
conditions that are very unfavourable to restarting voicing (German /b/ is 
essentially voiceless during the labial closure) if it follows very shortly after the 
labial release. This illustrates quite elegantly how supraglottal coordination can 
have repercussions on the realization of the voicing contrast. 

Fig. 5 illustrates these results by showing the sonagrams for the normal and 
high-overlap condition. 


4 See Yanagawa (2003) for an illustration of how constraints on gestural overlap and cohesion may underlie
certain metathetic processes in Hebrew.

5 There may well, however, often be a higher rate of apparent vowel epenthesis in rhotic than lateral clusters.
This is discussed in detail in section 2.3 below.

Figure 5 - Comparison of acoustic output for onset consonants of syllable /bra/ synthesized
with normal overlap (left) and high overlap (right). Note greater duration of voicelessness after
/b/ release in right panel.


2.3 Articulatory coordination in onset clusters: Implications and further discussion

We believe that the analysis of articulatory coordination presented in the 
preceding sections can throw useful light on the phonological processes 
apparently affecting these clusters. For example, based on some interesting 
acoustic observations of stop+lateral and stop+rhotic clusters in French 
Colantoni & Steele (2007, 2011) point to the particular prevalence of vowel 
epenthesis in voiced stop+rhotic sequences. Epenthesis is much rarer in voiced 
stop+lateral sequences and virtually non-existent in voiceless stop+rhotic 
sequences. The latter in turn are claimed to be particularly affected by a 
process of voicing assimilation since devoicing of the liquid is stronger in 
voiceless plosive+rhotic than in voiceless plosive+lateral. We feel, however, that
there are grounds for caution if the claim is that e.g. /br/ and /pr/ are affected
by radically different processes, at least if a process such as epenthesis is to be
interpreted as a cognitive operation on the phonological representation (with
the aim, in Colantoni & Steele’s terms, of cluster ‘simplification’ or ‘repair’).
Looking back to Fig. 3 it is clear that the articulatory overlap between C1 
and C2 is very similar for the voiced/voiceless pairs /br/ and /pr/. To us, this 
immediately reduces the attractiveness of assuming epenthesis just for the 
first case but not the second. The introduction of a vocalic element between 
two consonants should clearly affect the observable coordination relations 
between these consonants (assuming epenthesis in both cases, with the 
epenthetic vowel invariably voiceless in /pr/, might be a logical possibility 
but nonetheless not particularly attractive or useful). 


6 Colantoni & Steele (2007, 2011) also discuss Spanish, where the situation is different: i.e. apparent
epenthesis following both voiceless and voiced stops. Probably language differences in apicality vs.
dorsality are relevant here.

Essentially, we would argue that /br/ and /pr/ have a very similar gestural
specification in terms of the coordination of C1 and C2. Whether an 
epenthetic vowel appears at the acoustic surface is simply a side-effect of 
the voicing properties of C1 and does not require an account in terms of 
phonological processes. (This idea of epenthetic vowels as a side-effect of 
voicing patterns is supported by the informal observation that they are 
much less obvious in German /br/ than French /br/: phonologically voiced 
plosives are indeed fully voiced in French but usually substantially devoiced 
in German so the strength of voicing in the period immediately following 
/b/ will be much weaker in German, weakening in turn any impression of a 
vocalic transitional element.) Note that Colantoni & Steele’s observation of a 
weaker tendency to epenthesis in the lateral clusters also fits in well with our 
overlap measurements: i.e. the higher overlap for lateral than rhotic clusters. 
The crucial question is then what drives the low overlap in rhotic clusters. 
We indicated above one direction that an explanation could take: dorsal 
constrictions may be particularly unconducive to voicing (e.g. Ohala 1993; 
see also Colantoni & Steele 2011); accordingly, reducing the amount of 
overlap reduces the chances of an inappropriate amount of voicelessness 
at the release of a phonologically voiced consonant. Note that we assume 
that this could apply equally to German and French voiced stops despite 
their clear phonetic differences: excessive overlap may result in a delay in 
re-initiation of voicing after devoiced German /b/, but also in interruption 
of voicing of normally continuously voiced French /b/. A corollary of this 
line of thought also explains Colantoni & Steele’s further observation of the 
particularly extensive devoicing of /r/ in /pr/: even if French is traditionally
regarded as not having long VOT in voiceless stops the glottal conditions at 
the release of /p/ are certainly not favourable to voicing, and probably remain 
so over the transitional period until the formation of constriction for /r/. Since 
voicing is well-known to show a hysteresis effect in the sense that conditions 
for initiating voicing are more stringent than those necessary to maintain 
ongoing voicing (e.g. Hirose & Niimi 1987) then once voicing has ceased at 
the onset of /p/ re-initiation is not possible until the dorsal constriction has 
substantially weakened at the offset of /r/. This means that the low overlap 
between rhotic and plosive can on the one hand make it easier to maintain 
(or re-start) voicing for voiced C1 but also result in a particularly long period 
of voicelessness once voicing is interrupted for voiceless C1. Once again we 
would argue that the devoicing of /r/ in /pr/ does not require an explanation 
in terms of phonological processes but is, to a first approximation, a simple 
coarticulatory effect of the devoicing gesture of the initial C1 in combination 
with the effect of a dorsal constriction on aerodynamic conditions in the
vocal tract⁷.


3. Syllabic consonants in Slovak 


This second main section continues very much in the vein of the first, since 
it will again provide evidence that an understanding of the emergence and 
development of sound patterns can depend crucially on an understanding of 
the patterns of articulatory coordination involved. 

As already mentioned in the introduction, the overall aim of the work in 
this section was to arrive at a better understanding of why the occurrence of 
syllabic consonants is highly restricted. This can be understood as part of the 
ongoing aim of ourselves and others to understand how sounds are modified 
and their coordination patterns change depending on their role in the syllable. 
Many investigations have compared consonants in onset vs. coda position; here 
we now look at consonants in nucleus position based on work carried out by 
Pouplier & Beňuš (2011). This leads to a number of more specific questions 
along the following lines: 


- To what extent do syllabic consonants become more vocalic? 

- How does coordination of two consonants differ, for example, when both
are in the onset versus when one is in the onset and one in the nucleus?

- Does onset-nucleus timing depend on whether the nucleus is vocalic or 
consonantal? 


The basic research strategy is to exploit a language such as Slovak in which the
occurrence of syllabic consonants is actually quite unrestricted. In addition to
the fact that specifically /l/ and /r/ occur in nucleus as well as onset and coda
position, these syllabic consonants are — unlike English, German etc. — not 
restricted to unstressed syllables and can themselves take complex onsets (as in 
words like szrz, with nucleus /r/). Moreover, they are fully integrated into the 
Slovak morphology of nucleus length alternations (see Pouplier & Beňuš 2011, 
for further details on the linguistic background). 


7 We use the proviso “to a first approximation” because, extrapolating from our earlier work on German using
laryngeal fiberoptic endoscopy and transillumination (Hoole 2006), we expect there to be subtleties to
glottal coordination in clusters that we are not yet able to do justice to in French. This work is currently in
progress. For more background to the general idea of coarticulatory devoicing see e.g. Browman &
Goldstein (1986).


3.1 Recordings

Basically the same EMA setup was used as for the experiments described in 
Section 2 above (sensors on tongue, lips, jaw). Five Slovak speakers participated. 
Six repetitions of each target word were recorded per speaker. Target words were 
embedded in the carrier phrase: Už hovoríme hodinu. Examples of the 
target words are given in each analysis section below. 

Two main sets of analyses were performed. First, the basic kinematic properties 
of the liquids were examined as a function of position in the syllable; second, 
analyses of articulatory coordination similar to those exemplified already in Fig. 
1 were carried out. 


3.2 Basic kinematic properties of /l/ and /r/

The main thrust of this group of analyses was to determine whether the liquids 
became in any sense more vocalic when they formed the nucleus (as opposed 
to onset or coda). In terms of kinematic measurements this was defined as an
expectation for longer durations, lower velocities and lower stiffness (ratio of
peak velocity to movement amplitude) in nucleus position.
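As a rough illustration of the three kinematic quantities named above, the following minimal sketch derives them from a sampled one-dimensional sensor trajectory. It is not the analysis code used in the study: the function name, the dictionary keys and the sample values are hypothetical, velocity is approximated by a simple first difference, and stiffness is assumed to follow the common convention of peak velocity divided by movement amplitude.

```python
def kinematic_measures(position_mm, sample_rate_hz):
    """Duration, amplitude, peak velocity and stiffness of one movement.

    position_mm: sensor positions (mm) covering a single closing movement.
    Assumption: stiffness = peak velocity / amplitude (units 1/s).
    """
    # First-difference speed between successive samples (mm/s).
    speeds = [abs(b - a) * sample_rate_hz
              for a, b in zip(position_mm, position_mm[1:])]
    peak_velocity = max(speeds)
    amplitude = abs(position_mm[-1] - position_mm[0])
    duration_ms = 1000.0 * (len(position_mm) - 1) / sample_rate_hz
    return {
        "duration_ms": duration_ms,
        "amplitude_mm": amplitude,
        "peak_velocity_mm_s": peak_velocity,
        "stiffness_per_s": peak_velocity / amplitude,
    }


# Hypothetical closing movement sampled at 200 Hz.
measures = kinematic_measures([0.0, 1.0, 3.0, 6.0, 8.0, 9.0], 200.0)
```

On this definition a more "vocalic" gesture would show a longer duration together with a lower peak velocity and a lower stiffness value.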

The following list shows the words used to compare the kinematic properties of
the three syllable positions (upper case L is used here and in Table 2 below to 
indicate a liquid nucleus). 


Onset CVC: lak, lob; rak, raky, rok 
Nucleus CLC: chlp, blb; mrk, krk, krb 
Coda CVC: kal, mol; bar, ker, mor 


Table 1 shows results averaged over the five speakers. For brevity, only duration
of the consonantal constriction phase (‘plateau duration’; this corresponds to
Phase 4 in Fig. 1) and peak velocity are shown here. The velocity measure is
based on the closing movement.


                          ONSET       | NUCLEUS     | CODA
Plateau duration   /l/    53.3 (12.8) | 49.6 (27.0) | 40.5 (13.7)
                   /r/    15.3 (8.3)  | 27.7 (15.4) | 19.4 (12.2)
Peak velocity      /l/    31.8 (15.0) | 24.1 (8.8)  | 35.1 (13.1)
                   /r/    39.4 (13.0) | 45.4 (11.9) | 51.7 (14.2)


Table 1 — Mean (and standard deviation) plateau duration and peak velocity for /l, r/ as a
function of syllable position.
The basic point to observe is that there is no consistent pattern distinguishing
nucleus from onset and coda; the nucleus is not consistently longer and does
not have lower peak velocity than the marginal positions. If only onset and
nucleus are compared then there are no patterns that are consistent across both
consonants. Thus at this level of analysis it does not seem that liquids take on more
vocalic properties when they form the syllabic nucleus: syllabic consonants are
kinematically speaking still consonants (the results given here are representative
of the other measures as well, see Pouplier & Beňuš 2011 for details).


3.3 Articulatory coordination 

Articulatory coordination will be examined from two complementary points 
of view, firstly in terms of consonant-consonant coordination (comparing pairs 
where one member of the pair contains the nucleus and one does not), and 
secondly in terms of onset-nucleus coordination (comparing consonantal versus 
vocalic nuclei with the same onset). 


3.3.1 Consonant-consonant coordination 

For this analysis pairs such as mrak vs. mrk (onset cluster vs. onset+nucleus) and
park vs. mrk (coda cluster vs. nucleus+coda) were examined (target consonant 
sequence highlighted in boldface). Coordination between the consonants was 
captured by a measure that we will refer to as plateau lag, corresponding to the 
timepoint of the onset of Phase 4 minus the timepoint of the offset of Phase 2 
in Fig. 1. The results are given in Table 2. 
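The plateau lag measure just defined can be sketched in a few lines. This is an illustrative sketch only: it assumes that Phase 2 and Phase 4 of Fig. 1 are the constriction plateaus of the first and second consonant respectively, and the class name, function name and landmark times are hypothetical rather than taken from the study.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Plateau:
    """Constriction plateau of one consonant, in ms from utterance start."""
    onset_ms: float
    offset_ms: float


def plateau_lag(c1: Plateau, c2: Plateau) -> float:
    """Onset of the C2 plateau minus offset of the C1 plateau (ms).

    Positive values mean the plateaus are spaced apart (less overlap);
    negative values mean the plateaus overlap in time.
    """
    return c2.onset_ms - c1.offset_ms


# Hypothetical landmark times for three tokens of one C1-C2 sequence.
tokens = [
    (Plateau(100.0, 130.0), Plateau(185.0, 240.0)),
    (Plateau(90.0, 125.0), Plateau(185.0, 230.0)),
    (Plateau(110.0, 150.0), Plateau(190.0, 240.0)),
]
lags = [plateau_lag(c1, c2) for c1, c2 in tokens]
lag_mean, lag_sd = mean(lags), stdev(lags)
```

Averaging such per-token lags within each structural condition gives cell values of the kind reported as mean (SD) in the results table.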


                               PLATEAU LAG (MS)
                               /l/         | /r/
ONSET  onset cluster (CC-)   | 50.2 (20.5) | 83.7 (16.2)
       onset-nucleus (CL)    | 64.4 (19.1) | 90.2 (16.5)
CODA   coda cluster (-CC)    | 23.9 (17.8) | 27.1 (19.0)
       nucleus-coda (LC)     | 32.3 (15.5) | 40.8 (13.2)


Table 2 — Mean plateau lag (and SD) for consonant-consonant sequences differing in syllable
position.


Please note that since this table shows a lag measure (rather than the overlap
measure used in Section 2) larger (more positive) values indicate a wider spacing
between the consonants (i.e. less overlap). The main result to note is that the
lag is greater (overlap is less) when the liquid is in the nucleus, i.e. CL and
LC versus CC- and -CC, respectively. The other main result, which in fact is
numerically stronger than the first result, is that lags are greater in onset than
in coda position (i.e. compare the first two rows to the bottom two data rows of
the table). In traditional phonetic terminology this means that CC transitions 
are more open (in the sense of Catford 1977) to the left of the nucleus. Note 
that this applies not just to the comparison of onset clusters vs. coda clusters 
(comparing data rows 1 and 3 of the table) but also to the comparison of 
onset+nucleus vs. nucleus+coda (data row 2 vs. 4). For standard syllables with a 
vocalic nucleus it has become almost a commonplace observation in recent years 
that syllable structure is expressed in typical coordination relations among the 
structural elements of the syllable. The above two results make the important
point that this also applies to syllables with a consonantal nucleus, i.e. these 
syllables, too, have internal structure: words like mrk are not just a simple
concatenation of C+C+C. Putting this another way: for any given sequence
of consonants in Slovak the precise coordination relations among adjacent 
consonants will depend on the structural position in the syllable to which each 
consonant is assigned. 


3.3.2 Onset-nucleus coordination 

The second set of coordination analyses compares words such as blb (lateral
nucleus) with words such as Jib (vocalic nucleus). Preliminary examination 
indicated that articulatory movement for the vowel could be more reliably 
captured in terms of time of peak velocity of the movement towards the vowel 
(rather than in terms of the time of attainment of a constriction plateau), so a 
new lag measure was defined as timepoint of peak closing velocity for nucleus 
minus timepoint of peak closing velocity for onset consonant. The results of 
peak velocity lag for the different nucleus types averaged over speakers were as 
follows (mean +/- standard deviation, in ms): 


Vowel 82.5 +/- 23.7
/l/ 106.4 +/- 16.4
/r/ 151.9 +/- 10.2


Lag values are shortest for the vocalic nuclei. Note, though, that there are also 
clear (and statistically significant) differences between the rhotic and the lateral 
nuclei. As in Section 2 (and as just pointed out in footnote 8) the rhotics show 
particularly high lag (low overlap). 
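The peak velocity lag used in this section can be illustrated in the same spirit. The sketch below assumes that each trajectory excerpt contains only the relevant closing movement, places each first-difference velocity at the midpoint of its sample interval, and uses hypothetical names and values throughout.

```python
def time_of_peak_velocity_ms(position_mm, sample_rate_hz, start_ms=0.0):
    """Time (ms) at which the movement reaches its highest speed.

    position_mm: samples covering one closing movement only (assumption).
    start_ms: time of the first sample relative to a common reference.
    """
    speeds = [abs(b - a) * sample_rate_hz
              for a, b in zip(position_mm, position_mm[1:])]
    k = speeds.index(max(speeds))
    # Speed k spans samples k and k+1, so report the interval midpoint.
    return start_ms + 1000.0 * (k + 0.5) / sample_rate_hz


# Hypothetical onset-consonant and nucleus movements sampled at 200 Hz.
onset_peak = time_of_peak_velocity_ms([0.0, 1.0, 4.0, 6.0, 7.0], 200.0)
nucleus_peak = time_of_peak_velocity_ms([0.0, 2.0, 3.0, 7.0, 8.0], 200.0,
                                        start_ms=80.0)
peak_velocity_lag = nucleus_peak - onset_peak
```

A larger lag again corresponds to lower overlap between the onset consonant and the nucleus.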

The consonant-consonant coordination results indicated that syllables with
consonantal and vocalic nuclei have similar internal structure. The present
results for onset-nucleus coordination make clear that coordination patterns for
consonantal and vocalic nuclei are nevertheless not necessarily identical.


8 It is perhaps interesting to point out a similarity with Section 2 here: lags are generally greater for the
rhotics than the laterals (especially in onset position). Note that this is the case even though the rhotics are
apical in Slovak but dorsal in German and French.


3.4 Syllabic consonants: Discussion 

As part of the general discussion of the Slovak results we will now try to offer 
some ideas as to what could be driving the low-overlap coordination pattern for 
the consonantal nuclei. 

Useful background is given by the assumption in much work in articulatory 
phonology and coupled-oscillator models of syllable structure that onset 
consonants are timed in-phase with the following vowel (in effect, their activity 
starts at the same time). In normal CV syllables this does not result in the vowel 
being obscured by the consonant because the vowel has longer duration (lower 
stiffness). For the Slovak syllables with consonantal nuclei this would be a 
problem, however, because as we saw in the first part of the results consonants in 
nucleus position do not have clearly different durational properties from those 
in marginal positions. Thus for the onset C and the nucleus C to be reliably 
recoverable by the listener a low-overlap pattern of coordination is required. 
This links up in turn with ideas in the first part of the paper: one reason for the
typological rarity of syllabic consonants may be that they require a departure 
from default CV coordination patterns. Put slightly differently, syllabic 
consonants interrupt the basic construction principle for spoken language of a 
slow continuous vocalic substrate with overlaid consonantal constrictions. 

We believe that one reason why these typologically rare patterns were able to 
emerge in Slovak is that the language in general favours a low-overlap setting 
for consonant-consonant coordination. Thus while the values for plateau lag for
onset clusters shown in the first data line of Table 2 are shorter than those for
the onset+nucleus case (in the second line of the table) they are nevertheless still
quite long in absolute terms, ‘long’ meaning that it is very typical in Slovak to
find a sonorant transition (‘epenthetic schwa’) between C1 and C2 (see Pouplier
& Beňuš 2011, for further details and illustrations). 


4. Conclusions 


The two main parts of this paper have both tried to make the point that 
understanding how sound systems develop crucially requires an understanding 
of how the various articulatory subsystems are coordinated by the speaker. 
Combining articulatory measurements with articulatory modeling helps in turn 
to better understand what coordination patterns result in sound structures that
can reliably be recovered by the listener in perception. Rhotics, and liquids in 
general, are particularly advantageous in these contexts because complex syllable 
structures make such heavy use of them. 
Abstract

We investigate the lingual shapes of the five liquid phonemes of Malayalam: two rhotics,
two laterals and a more problematic 5th liquid. Ultrasound is used to image the mid-sagittal
tongue surface, mainly in an intervocalic within-word /a__a/ context. The dark
retroflex lateral and trill have a retracted tongue root and lowered tongue dorsum, while
the three other clear liquids show advanced tongue root and dorsal raising. The 5th liquid
is post-alveolar and laminal. Some additional data from an /a__i/ context is considered:
the liquids are slightly clearer before /i/: all have a slightly advanced tongue root, and all
bar the trill show palatalization. Dynamically, the trill and retroflex lateral have a very
stable tongue root in /a__a/, and the 5th liquid has unusual anterior kinematic properties
which require further investigation.


1. Introduction 


1.1 Background 

As part of its liquid inventory, Malayalam (a Dravidian language of southern
India, Krishnamurti 2003) has two rhotics (/r/ and /ɾ/, a trill and a tap), two
laterals (/l/ and /ɭ/, an alveolar and a retroflex lateral respectively) and a 5th
liquid, most commonly labelled /ɻ/. This last segment has been analysed either as
a rhotic, specifically a ‘voiced sublamino palatal approximant’ (Asher & Kumari
1997:419) or as a lateral, specifically a ‘voiced retroflex palatal fricativised lateral’
(Kumari 1972:27-28). The only two experimental studies on this general topic
prior to 2010 were on the related language Tamil, and in these the 5th liquid
was classed as a central retroflex approximant (McDonough & Johnson 1997;
Narayanan et al. 1999), but more recently, Punnoose (2011) and Punnoose et
al. (2012) have explored the Malayalam liquids from both phonological and
phonetic points of view (as well as comprehensively reviewing the existing
literature), revealing greater complexity. These recent papers draw attention to
the importance of secondary resonances in the system of oppositions as well as
contrasts based on primary manner and place of articulation.

In addition, some contemporaneous research on Kannada is using ultrasound 
to explore retroflex stops in relation to other lingual obstruents (Kochetov et al. 
to appear; Kochetov et al. 2012), which will provide results that will dovetail to 
some extent with those presented here. 

While it might be possible to definitively classify the 5th liquid either as a rhotic,
a lateral or a non-rhotic central approximant, a non-deterministic or ‘fuzzy’
approach to phonological systems (cf. Scobbie & Stuart-Smith 2008) assumes
there may be underlying phonological and phonetic reasons for its ambiguous
status. The phonological patterning of /ɻ/, for example, offers a somewhat mixed
picture regarding its lateral/rhotic identity, and the phonetic characteristics
relevant to distinguishing it from the other liquids might likewise be variable or
gradient. On the one hand, the rhotics and the 5th liquid are the only consonants
not to have a singleton-geminate contrast. On the other hand, /ɻ/ and /ɭ/ tend
to alternate in certain morpho-syntactic contexts. For more detail on these
complex patterns, see Punnoose (2011). 

Punnoose (2011) and Punnoose et al. (2012) have shed new light on the sound
system using production data from eight adult males. These detailed studies
constitute the first acoustic investigation of Malayalam /ɻ/. One of their aims
was to consider the hypothesis that /ɻ/ is a third rhotic. Also, in (mainly auditory)
impressionistic terms, they have found that /ɻ/ sounds like a clear post-alveolar
central approximant, and that it appears to lack retroflexion, in the sense that it
lacks strong retraction with perhaps sublaminal contact during a forward-moving
constriction.

In addition, they have explored the acoustic phonetic nature of the liquid 
system, exploring a number of parameters that can distinguish (or not) these 
segments from each other. Acoustically, however, it is hard to definitively
categorise /ɻ/ as either rhotic or lateral. Its first two formants (especially F2)
were found to be close to the values for one rhotic (the tap) and one lateral
(the alveolar approximant), namely /ɾ/ and /l/, which Punnoose and colleagues
classify as having a clear resonance, while its F3 and F4 pattern closer to the
values for /r/ and /ɭ/, which they class as having a dark resonance.


1.2 Summary of methodological and descriptive goals 

The rhotic system of Malayalam has a binary phonological opposition, which is 
traditionally characterized as tap vs. trill. A secondary phonetic correlate of this 
contrast is a clear vs. dark resonance difference (for the tap vs. trill respectively), 
which is the same secondary resonance distinction found in the alveolar and
retroflex laterals (respectively). It would therefore be useful to get information
on the secondary and primary articulation of both rhotics, both laterals, and
the 5th liquid. No single articulatory technique can provide all the information
required, but ideally we would use one which is:


a) Fast enough to be able to image the tongue cleanly during the short 
constriction phase, and therefore able to produce a single lingual shape to 
characterize both approximant and more ballistic manners of articulation; 

b) Frequent enough to be able to reveal aspects of the fast changes in tongue 
shape and location for coarticulatory and ballistic motions before, during 
and after the liquid; 

c) Capable of revealing aspects of tongue root articulation relevant to 
secondary articulation as well as aspects of the blade and tip kinematics 
relevant for the primary place and manner of articulation. 


Here we use high-speed Ultrasound Tongue Imaging (hs-UTI). Ultrasound
scanning provides us with a mid-sagittal, two dimensional view of the tongue 
(Davidson 2012). Ultrasound is a convenient non-invasive technique, but it 
should be noted that in order to stabilize the probe to the head, protocols need 
to be adopted which shorten data collection time due to speaker fatigue, and also 
that high-speed UTI requires more expensive and specialized instrumentation 
than normal video-based UTI. The low cost and portability of the latter make it 
more likely to be used in fieldwork (Gick 2002, Lawson et al. 2008, Lawson et 
al. 2011), but its longer data-capture window is more subject to spatiotemporal 
artefacts (Wrench & Scobbie 2006) and thus is harder to synchronize to the 
acoustic signal. For many purposes, the approximately 60 frames per second 
that is possible with de-interlaced video UTI should prove sufficient, but to 
capture details of very fast moving ballistic flaps or clicks, for example, hs-UTI
is likely to be preferable (Wrench & Scobbie 2011). The non-dynamic findings 
presented here would be easily observed using video UTI. 

We will look for articulatory correlates of the clear vs. dark distinction by 
examining the position of the tongue root and dorsum. The relative location 
of anterior parts of the tongue, whether blade or tip, will be examined to reveal 
differences in constriction location and shape, which can be related to the 
primary place and method of articulation. These comparisons will be largely 
qualitative and tentative, since this is a single-speaker study, and one which uses 
a very small dataset. Our interpretative comments are based on highly accurate 
instrumental data, the nature of which we will try to convey in illustrative 
figures below, guided by what we have found from our earlier acoustic and
transcriptional phonetic research. 


2. Method 


2.1 Instrumentation 

Each digital ultrasound image was created from a single scan from a probe held 
by a stabilizing headset (Articulate Instruments 2008, Scobbie et al. 2008). The 
scan rate / frame rate is flexible, up to around 400 frames per second (fps), and 
was set at 100 fps (one frame each 10 ms), synchronized to the audio via the 
high-speed Articulate Assistant Advanced™ system (Articulate Instruments 
2011, Wrench & Scobbie 2011), based around an Ultrasonix SonixRP scanner 
remotely controlled via Ethernet from a PC. The transducer was a short-handled
paediatric microconvex probe operating at 6 MHz. The field of view
was set at 112.5°.

Acoustic data was recorded on the Articulate Instruments multichannel 
acquisition system at 22,050 Hz. In this system, a hardware pulse generated at 
the moment that each ultrasound scan is made enables accurate synchronisation 
with the acoustics. Each ultrasound frame is then stored by the AAA system 
as a set of raw echo-pulse return data (Figure 1a) from which a standard two 
dimensional image is created when viewed (Figure 1b). A semi-automatic line- 
fitting process within AAA was used to trace the location of the tongue surface. 
Figure 1a shows how an ultrasound scan samples the space in the field of view: 
pulses spread out the further they get from the probe. Each image is therefore 
less accurate in circumferential as opposed to radial dimensions. This may be by 
as much as around an order of magnitude (e.g. ~3 mm as opposed to ~0.3 mm). 
When the tongue surface is orthogonal to the echo-pulse beams, distance from 
probe data can be very accurate, but a tongue surface that is retroflexed or 
otherwise positioned so that it is approximately parallel to the beams is picked 
up less accurately, due partly to greater scattering of the ultrasonic echo, and 
partly as an artefact of image processing: the tongue surface will be detected 
at the location of each echo-pulse beam, discretising its location in little steps, 
as we will see below. A potential imaging problem is that strong retroflexion, 
where the tongue tip curls back during a sublaminal contact, may not be 
visible. Such articulations will look similar to highly apical supralaminal post- 
alveolar or palatal articulations in static images. Therefore, what appear below 
to be very retracted apicals are classified by us as retroflex. We may however 
be underestimating the extent of this retroflexion by not being able to detect 
any sublaminal contact. A paired UTI/MRI (or UTI/EMA) dataset would be
useful to help investigate this issue further. 
Figure 1 — Ultrasound images of a single frame. Anterior is to right. a. The raw data return shows
echo data from 76 echo-pulse scan-lines radiating from the probe (the curved indent, bottom 
centre). They are each 8 cm long. Bright areas result from a higher intensity of echo, which is 
particularly strong on the tongue-air boundary. b. This standard 2D image is constructed from 
this echo pulse data by AAA software, interpolated in arcs to fill in gaps between scan-lines. 
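The loss of circumferential accuracy with distance from the probe follows directly from the fan geometry, as the following minimal sketch shows. It assumes that the 76 scan-lines are evenly spread across the 112.5° field of view and that depth is measured from the centre of the fan; the function name and its defaults are illustrative and are not part of the AAA software.

```python
import math


def scanline_spacing_mm(depth_mm, field_of_view_deg=112.5, n_scanlines=76):
    """Arc distance (mm) between adjacent scan-lines at a given depth.

    Assumes evenly spaced scan-lines, so the field of view is divided
    into (n_scanlines - 1) equal angular steps.
    """
    step_rad = math.radians(field_of_view_deg) / (n_scanlines - 1)
    return depth_mm * step_rad


# Spacing near the probe versus at the far end of an 8 cm scan-line.
near_spacing = scanline_spacing_mm(20.0)
far_spacing = scanline_spacing_mm(80.0)
```

Under these assumptions adjacent scan-lines are about 2 mm apart at the 8 cm limit, so the spacing between beams at depth is several times coarser than the sub-millimetre radial accuracy available along each beam.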


For analysis, we used the AAA software. We superimposed a measurement fan 
of 42 radial measurement lines onto images such as the one in Figure 1b (the 
42 radii being independent of the number of scan lines or the size of the field 
of view). A single control point (or “tag”) on each fanline radius was used to tie 
an analysis curve to the location of bright tongue-air boundary, if this indication 
of the tongue surface is indeed visible at all in that area of the measurement 
fan. Gradient confidence measures of this edge-tracking were available when 
automatic edge-tracking fitting was used. Hand-corrections were used as an override
if needed. Thus a tongue curve was created for each frame of interest with at
most 42 coordinates. We chose the AAA option to smooth each curve slightly to 
avoid tracking noisy aspects of the image too closely. 

In this study, each word spoken (for materials, see below) provided a single 
representative tongue curve for one target liquid. These were averaged in AAA to
provide a mean curve, a process which also provides the standard deviation of
the location of the contributory tag points along each fanline.
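The averaging step can be pictured as follows. This is only a sketch of the general idea and not a description of how AAA implements it: each token curve is assumed to be a list of radial distances, one per fanline, with None wherever the tongue surface was not visible, and the function name, the min_tokens parameter and the data are hypothetical.

```python
from statistics import mean, stdev


def average_curves(curves, min_tokens=2):
    """Mean and SD of tag-point distance along each fanline.

    curves: one list per token; each entry is the radial distance (mm)
    of the tag on that fanline, or None if no surface was traced there.
    Returns one (mean, sd) tuple per fanline, or None where fewer than
    min_tokens tokens contribute a value.
    """
    n_fanlines = len(curves[0])
    averaged = []
    for i in range(n_fanlines):
        values = [curve[i] for curve in curves if curve[i] is not None]
        if len(values) >= min_tokens:
            averaged.append((mean(values), stdev(values)))
        else:
            averaged.append(None)
    return averaged


# Three hypothetical tokens traced on three fanlines (a real fan has 42).
token_curves = [
    [50.0, 52.0, None],
    [51.0, 53.0, None],
    [52.0, 54.0, 40.0],
]
mean_curve = average_curves(token_curves)
```

Requiring at least two contributing tokens per fanline keeps the standard deviation defined and avoids plotting a "mean" that rests on a single trace.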


2.2 Speaker and protocol 

The male adult speaker and the second author are both (multilingual) native
speakers of the Central Travancore dialect of Malayalam. The second author
explained the materials and the orthographic conventions for the Latin script
presentation (Table 1) orally and using Malayalam orthography. The speaker
was familiar with these conventions. Latin script had to be used for software 
reasons. She also monitored data collection to make sure there were no mistakes, 
intervening when needed to elicit a correction. The speaker also corrected 
himself on a few occasions, and showed a high level of awareness of his target 
productions. Words were recorded in pairs for speed of data collection — an 
audible beep accompanied the appearance of the prompt on screen, and the 
speaker then said the word twice with a small pause in between each token. The 
speaker's bite plane was also imaged to enable rotation of images to the occlusal 
plane, and images of swallowing were captured to provide an indication of the 
location of the alveolar ridge and hard palate. 


2.3 Materials 

TAP       TRILL     ALVEOLAR  RETROFLEX  5TH
/ɾ/       /r/       /l/       /ɭ/        /ɻ/
ara 4     aRa 8     ala 4     aLa 4      azha 4
                    mala 4               mazha 8
maram 8             malam 4
                                         pazham 8
ari 8     aRi 4     ali 4     aLi 4      azhi 4
ira 8               ila 8                izha 8
kara      kaRa      kala      kaLa       kazha
kari      kaRi      kali      kaLi       kazhi
mura                mula      muLa
pura                                     puzha
                              poLa


Table 1 — Materials, a mix of real words and pseudo-words (underlined), with a count of the 
number of tokens elicited (and analysed). Italicised words are not analysed here. 


Articulating five liquids 


The materials were designed to elicit minimal sets of all five liquids, which
is possible only in intervocalic position, with multiple repetitions, within 20 
minutes to avoid speaker discomfort, and in mainly low vowel and non-lingual 
consonantal contexts to reduce coarticulation effects. The liquids were mainly 
elicited between /a/ vowels, or in an /a__i/ context, with a few other tokens in 
other contexts to provide further detail. Pseudo-words were kept to a minimum, 
and filled out the /a__i/ frame. For methodological simplicity labial consonants 
(or none) in the carrier words were preferred, to avoid unwanted lingual 
coarticulation. No carrier phrase was used, again to minimize coarticulation 
and to speed up data collection. In addition, a passage was elicited (presented in 
Malayalam orthography). This has not yet been analysed. 

At least four tokens of each (pseudo) word were captured, with 8 tokens for 
some. The number of tokens elicited for words formally analysed in this paper
is also given in Table 1 (in bold typeface). We will focus here on the core /a__a/ 
context (with no other consonant or a labial consonant) which provides most 
tokens, though, looking ahead, it is clear that words with /k/ or with other 
vowels show similar behaviour. 


2.4 Annotation and data extraction 

Figure 2 attempts to convey, in a single composite image, the way a tongue surface 
changes shape and location in time in the mid-sagittal plane, by presenting an 
overlay of a temporal sequence of tongue curves, using /ɾ/ as an example. Each
of these curves has been semi-automatically traced over an image like the one in 
Figure 1b using AAA software. During the actual annotation process, however, 
a time sequence of un-traced raw images was examined one at a time in a 
moving sequence, manually-controlled, from which one frame was chosen on 
a holistic basis as being the one containing a tongue shape best characterizing 
the target. Thus normally from each word only one tongue curve was drawn, on 
the target frame. It was then extracted and averaged together with shapes from 
other tokens of the same target. Plotting the average target enables an accurate 
comparison of the position and shape of each of the five liquids. 

The basis for selecting a single frame from the ultrasound sequence was as follows. 
The frame chosen (by the first author) was the one which seemed to characterise 
the consonantal constrictions as a whole, but with primacy given to the anterior 
articulation. For /ɭ/ and /ɻ/ the frame chosen was the one with the most strongly
retracted and raised blade, often achieved for just a single frame or two, while the
other three liquids were captured at a frame of a stable (alveolar) constriction. For
the rhotics and /l/, if many frames seemed equally characteristic of the anterior
articulation, the most extreme root articulation dictated the choice of frame.
Figure 2 also shows the overall orientation of the tongue shape. It has been
rotated so that the speaker's occlusal plane is horizontal (at 48 mm high). The
location of the hard palate can be estimated by examining the images of
swallowing, because the tongue presses up against the hard palate during
the swallowing action. A palate shape can be superimposed on plots of the
lingual targets in the AAA workspace (see Figure 4 below). The target shape
for the /ɾ/ in Figure 2 corresponds to the last in the sequence plotted (i.e.
with the highest blade). It was the curve used as the basis for calculating the
mean tongue configuration for /ɾ/. The /a/ curve, however, is for illustration
only: in this token it was early in the vowel, but no particular convention was
used to identify it. Overall, Figure 2 conveys something of the nature of
tongue movement and how segmental targets can be identified; in this case
an intervocalic liquid.
Figure 2 — Example of the dynamic movement from /a/ into the tap /ɾ/, rotated so that the
occlusal plane is horizontal. Curves are 10 ms apart in time, and this example lasts 160 ms.
The direction of movement in time is shown by the large arrows. The origin of the measurement
space is arbitrary: the lower edge of the upper teeth is approximately at (110, 50). Tick
marks are at 20 mm intervals. The same axes are used in all figures, to enable comparison of
pharyngealisation and palatalization.
Figure 2 is not merely illustrative, however: it is also a useful way of conveying
the dynamic and coarticulatory aspects of the articulation of an intervocalic 
liquid, as we will see below. 

Note that as the tongue tip rises, an air pocket may appear under the tip. This
sub-lingual cavity is particularly important as a resonant chamber, influencing 
the acoustic properties of the liquid sounds — but unfortunately air (and hence 
the sublingual cavity) is impenetrable to ultrasound at the frequencies used in 
scanners. A raised tip is likely to be invisible to ultrasound: the corollary is 
that neither the anterior termination of the tongue surface in an ultrasound 
tongue image nor the right-hand end of the curve traced from it will necessarily 
correspond to the tongue tip itself, just to the most anterior part of the blade 
which is imageable. Moreover, when the tongue is in contact with the hard 
palate or alveolar ridge, the discernibility of the surface often diminishes. 
Finally, by measuring the distance between the curves in Figure 2 it is possible 
to estimate the speed of the tongue blade surface as it moves through the oral 
cavity. Just a single typical token of each liquid was quantified, as an indicative 
measure. We measured the speed of the blade orthogonal to the direction of 
travel — thus most segments were examined moving along a trajectory that was 
roughly vertical in these figures, whereas the forward movement of the retroflex 
flap was captured in the analysis of its motion (see Figure 9). From these tokens, 
a rough articulatory duration of ‘time at the target’ was also calculated, based 
on the number of frames where the tongue blade was moving slower than a 
threshold of 30 mm/s. 
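The speed and ‘time at the target’ estimates described above can be sketched as follows, assuming that the blade position has already been reduced to one value per frame along the chosen measurement direction. The function names and the example positions are hypothetical; the 100 fps frame rate and the 30 mm/s threshold are those given in the text.

```python
def blade_speeds(positions_mm, frame_rate_hz=100.0):
    """Speed (mm/s) of the blade between each pair of successive frames."""
    return [abs(b - a) * frame_rate_hz
            for a, b in zip(positions_mm, positions_mm[1:])]


def time_at_target_ms(positions_mm, frame_rate_hz=100.0,
                      threshold_mm_s=30.0):
    """Total time (ms) during which the blade moves slower than threshold.

    Each frame-to-frame interval below the threshold contributes one
    frame period (10 ms at 100 fps).
    """
    speeds = blade_speeds(positions_mm, frame_rate_hz)
    slow_intervals = sum(1 for s in speeds if s < threshold_mm_s)
    return slow_intervals * 1000.0 / frame_rate_hz


# Hypothetical blade heights (mm) in successive frames 10 ms apart.
heights = [0.0, 2.0, 4.0, 4.2, 4.3, 4.2, 2.0]
target_duration = time_at_target_ms(heights)
```

With frames 10 ms apart the resulting duration is quantised in 10 ms steps, which is one reason to treat such values as indicative only.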


3. Impressionistic results 


The impressionistic phonetic realisations of our single speaker's rhotics and
laterals were broadly comparable to those in Punnoose (2011) and Punnoose
et al. (2012). Primary place and manner were as expected, with the following
exceptions: the tap /ɾ/ was sometimes quite stop-like and sometimes fricated;
the trill /r/ was often undershot, as seems typical for phonemic trills (Jones
submitted); the lateral /l/ was sometimes fricated.

In terms of secondary resonance, the /ɾ/ and the /l/ both sounded relatively clear
while the /r/ and the /ɭ/ were both impressionistically darker in resonance, as
Punnoose and colleagues have found. The 5th liquid /ɻ/ sounded post-alveolar,
neither strongly clear nor dark; it was often fricated, and had some mild
rhotic qualities in the approach to maximum constriction (cf. the temporally
asymmetrical frication and formant movements in Figure 3).
Figure 3 — Example token of azha /aɻa/ showing temporal asymmetry in its formant movement
and frication.


When the liquids were before /i/, mostly there were only small differences 
between /aCa/ and /aCi/: /ali/ was more consistently fricated than /ala/, but 
the number of tokens is too small to draw conclusions from. Bigger auditory 
differences from /aCa/ were observable for liquids following /i/, for the small 
dataset available. Only the clear liquids appear in this context, and all were 
fricated. 

In /aɾi/, as with /aɾa/, the tap often was stop-like, but in /iɾa/, it was more
like a short fricative or a fricated tap. The alveolar /l/ in /ila/ was strongly
fricated, probably due to the high tongue position. The /i/ in /iɻa/ sounded
less peripherally high and front than the /i/ in /iɾa/ and /ila/, and there were
formant differences.


4. Articulatory results 


4.1 Single frames 

Overall, a different spatial tongue shape was found for each liquid (Figure 4), in 
addition to dynamic differences which will be explored more below. 

Bearing in mind that the tip might not have been imaged, in Figure 4 we can 
see a slightly tighter constriction between what is probably the alveolar region 
and the tip/blade for /l/ than for the other liquids: the blade is pretty parallel
to the occlusal plane, and lies about 12 mm above it. The comparable part of
the tongue in both rhotics appears to lie about 3 mm lower. In the tap, as with
/l/, the blade is pretty flat (then rises slightly in the dorsal area), whereas for
the trill, the blade slopes ‘downwards’, about 6 mm closer to the probe. Strong
retroflexion is clear in /ɭ/, causing some artefacts that result in the surface
appearing to pass through the hard palate¹. The 5th liquid /z/ is clearly
post-alveolar and laminal.

Clear pharyngealisation can be seen in the dark /ɭ/ and /r/, with root retraction
of about 1 cm compared to the three other liquids. Slight raising of the front
of the tongue, which may be a type of weak palatalization, can be seen in the
clear /ɾ/ and especially /l/, and also in the 5th liquid /z/ in addition to its
post-alveolar close constriction. The 5th liquid looks quite like a bunched
tip-down rhotic (Lawson et al. 2011).

Figure 4 — Averaged tongue shapes for the five liquids between /a/ vowels. Thick line indicates
the mean, flanked by ±1 s.d. The rhotics are in red, laterals in blue, 5th liquid in green.
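
The averaged shapes in Figure 4 can be thought of as a pointwise summary over repeated tokens. The following Python sketch is purely illustrative: it assumes that each token's tongue curve has already been resampled to heights (in mm) at the same set of measurement positions, and it uses the sample standard deviation. The function name `mean_sd_curve` and both of these choices are assumptions made for the example, not details of the analysis reported here.

```python
from statistics import mean, stdev


def mean_sd_curve(tokens):
    """Return (mean, mean - 1 s.d., mean + 1 s.d.) for a set of tongue curves.

    tokens: list of curves; each curve is a list of heights (mm) measured at
    the same positions in every token (an assumption of this sketch).
    """
    if len(tokens) < 2:
        raise ValueError("at least two tokens are needed for a standard deviation")
    n_points = len(tokens[0])
    if any(len(curve) != n_points for curve in tokens):
        raise ValueError("all curves must have the same number of points")

    # zip(*tokens) groups the heights of all tokens at each measurement position.
    columns = list(zip(*tokens))
    means = [mean(col) for col in columns]
    sds = [stdev(col) for col in columns]  # sample s.d. (n - 1 denominator)
    lower = [m - s for m, s in zip(means, sds)]
    upper = [m + s for m, s in zip(means, sds)]
    return means, lower, upper
```

The three returned lists correspond to the thick mean line and the two thinner flanking lines of a plot such as Figure 4.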


Similar locations and shapes can be seen for the liquids in the contexts of the 
other vowels examined, with some slight coarticulatory differences. 


1 Recall that the retroflex lateral may in fact have a sublaminal contact, but since we are unable to detect the
‘looping back’ in static images, we have to ‘join the dots’ to give the impression of a supra-laminal contact.


4.2 Dynamic analysis

A dynamic qualitative articulatory analysis was undertaken, and the results 
will be conveyed using typical single tokens below. A tongue curve was traced 
onto every frame from a stable vowel position before the liquid to one after it. 
The tongue roots are generally dynamically active, except in /aɭa/, in which
only the blade moved, as part of the rapid forward-moving flap. Overall, the
rhotics (/r/, /ɾ/) and laterals (/l/, /ɭ/) had relatively simple motion paths which
are well-represented in the figures, but the 5th liquid /z/ had a more complex
blade motion, which we have tried to represent with a bendy arrow.

In each of the figures below, representing a single token, the left panel 
represents the movement from the preceding vowel to the target, and the right 
panel from the target through to the following vowel. 


Figure 5 — Example of the dynamic movement into and out of the clear tap /ɾ/ in an /a/ context.
The left frame shows the closing gesture from /a/ into the target frame for /ɾ/, and the right
frame the opening gesture from /ɾ/ into /a/. Curves are 10 ms apart in time, and the arrow
shows the direction of movement in time. These conventions apply to the other dynamic
figures below.


The slightly wider spacing of the traces in the left panel of Figure 5 near the tip,
at the start of the tap, shows the most rapid movement in the whole sequence,
as the blade and tip move rapidly upwards. The tongue does, however, stay in
the constriction location for a couple of frames: the shortest constriction is
around 20 ms.

Figure 6 — A single token of the dynamic movement into and out of the dark trill /r/. The left
panel includes more than one raising-lowering cycle of the blade (and tip), indicated by the
thin arrow.


One of the most striking things about the trill is the stability of the root in the
/a__a/ context. In the left panel, it is perhaps not obvious that there are two trill
events in this /r/, as the tip rises quickly to the maximum height (shown by the
widely spaced curves), lowers by a few millimetres, and rises again. The time
spent in this location for one constriction of the trill is short, around 30 ms. The
up-and-down motion of the blade can be easily seen in a velocity trace (Figure 7).
We can see rapid upwards movement of around 250 mm/s at 50 ms, followed by
a downward-upward-downward-upward trilling motion between approximately
75 ms and 140 ms, followed by downwards movement towards the next vowel from
150 ms.

Figure 7 — Velocity upwards (positive values) and downwards (negative values) of the blade
in trilled /ara/.

The alveolar lateral (Figure 8) is clearly palatalized, with a consistent speed of
transition. The time spent at the constriction is around 70 ms, around twice as
long as for the other liquids. The rather ‘pointy’ palatalization may be an imaging
artefact.

Figure 8 — A single token of the dynamic movement into and out of the clear alveolar lateral /l/.


The dark retroflex lateral, like the dark trill, shows a highly stable tongue root
in the /a__a/ context (Figure 9) with a retraction and raising of the blade (and
perhaps inversion of the tip)², followed by a very rapid forward flapping motion.
The time spent at the maximum retracted location is brief, around 30 ms. The
forward motion of the tongue in the right panel shows analysis artefacts that 
could be resolved by smoothing multiple tokens: the tongue will actually be 
moving forwards evenly, but its location is only represented in the raw data 
along the echo-pulse beams giving rise to the clumping when the tongue surface 
is nearly parallel to those beams (Wrench & Scobbie 2011). 

Generally, the speed of the movement of the tongue into and out of the 
constrictions has a peak of around 100-150 mm/s (about 50 mm/s faster in 
the closing gesture than the opening gesture), except for this flap, in which we 
estimate a closing speed of 200 mm/s and a forward flapping speed of around 
400 mm/s. Only the 5th liquid moves as fast in its closing gesture.
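
The speeds quoted above, like the velocity trace in Figure 7, relate directly to how far the traced surface moves between successive frames. As a rough illustration only, a finite-difference estimate from blade heights sampled every 10 ms could be sketched in Python as follows. The function names and the example heights are hypothetical, and this is not claimed to be the procedure used to produce Figure 7.

```python
def blade_velocity(heights_mm, frame_interval_ms=10.0):
    """Frame-to-frame vertical velocity in mm/s; positive values are upward."""
    if frame_interval_ms <= 0:
        raise ValueError("frame_interval_ms must be positive")
    frames_per_second = 1000.0 / frame_interval_ms
    # Displacement between consecutive frames, scaled to mm per second.
    return [(later - earlier) * frames_per_second
            for earlier, later in zip(heights_mm, heights_mm[1:])]


def count_reversals(velocities):
    """Count changes of direction, ignoring frames with zero velocity."""
    signs = [1 if v > 0 else -1 for v in velocities if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Hypothetical blade heights (mm) at 10 ms intervals; not measured data.
example_heights = [0.0, 2.5, 5.0, 4.0, 5.0, 4.0, 5.0, 3.0, 1.0]
example_velocities = blade_velocity(example_heights)
```

With these made-up heights, a 2.5 mm rise between consecutive frames corresponds to 250 mm/s, and the fall-rise-fall-rise pattern that follows shows up as alternating negative and positive velocities, which is the kind of signature a trill leaves in a velocity trace.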


2 It is possible, recall, that there is sublaminal contact here which we cannot represent, though we think on
consideration that the contact is supra-laminal.

Figure 9 — A single token of the dynamic movement into and out of the darker retroflex lateral
/ɭ/. A more detailed path of movement is shown by the thin arrows.


Finally, the 5th liquid has rapid movement in the closing phase, with typically
rhotic retraction and raising in a curving path, albeit weakly, and not in a
retroflex way. The release of the constriction is, however, rather extraordinary,
showing a zig-zag motion that is hard to convey in these diagrams. The
movement is, we think, indicative of a change in shape of the blade and tip of 
the tongue. It may have to do with a transition between a more grooved central 
and more lateral or slit-like airstream, retraction of the tip, or some other 
changes in lingual shape. Whatever is the source of this strange movement, 
we presume the dynamic changes are not accidental, but are associated with 
the partially rhotic and partially lateral nature of the segment. The opening 
gesture starts with a downward, slightly retracting opening, quite unlike the 
tip-down bunched /r/ seen in Lawson et al. (2011). Moreover, at the end of the
5th liquid, in fact in the following /a/, the lowered blade extends forwards again
without lowering more, indicating it has been previously retracted into the
tongue body. The tip also appears to elongate forwards in /azi/ during the /i/.
A final possibility is that this apparent motion is mainly due to the filling in
of the midsagittal sublingual cavity by the underside of the tip and blade, and
that the upper surface of the tongue is hardly moving forward at all.

Figure 10 — A single token of the dynamic movement into and out of the clear 5th liquid /z/. The
more detailed path of movement is shown by the thin arrows.


Finally, here is an example of the dynamics of an asymmetrical environment for
one of the liquids, in this case /r/. Tongue root advancement and palatalization
can be seen following the trill, but the root is still very stable before the trill,
more so in fact than the root in /ali/ where the coarticulatory influence of the
/i/ extends more strongly across the whole of the liquid into the preceding
vowel.

Figure 11 — A single token of the dynamic movement into and out of the darker trill /r/, in /ari/. The
right panel shows palatalization (raising) and root advancement in the transition towards the /i/.


4.3 Coarticulation

Some of the liquids have been shown to be clear and others dark, acoustically 
(Punnoose 2011) and in articulation (above). Punnoose also examined the effect
of vowel context on the liquids, and vice versa. With the small amount of
data available here, we will chart changes to liquids, looking at the effect of 
coarticulation of an /i/ vowel compared to the /a__a/ context presented above. 
We will do this mainly for /aCi/, but also mixing in the small number of /iCa/ 
materials. 

The effects of coarticulation can be seen in Figure 12 and Figure 13 below.
For the clear liquids (Figure 12) the tongue root is a little more advanced and
there is some independent extra raising of the tongue into the palatal arch,
particularly for /l/ and /ɾ/. The 5th liquid may perhaps become more apical
and alveolar in conjunction with its slight palatalization (which seems to also
include some velarisation).

Figure 12 — Mean (thick lines) and standard deviation (thin) of target /z/, tap /ɾ/ and lateral /l/
in /a__a/ context (solid lines) vs. mean targets from mixed /i__a/ and /a__i/ contexts (dashed).

The two dark liquids are shown in Figure 13. Again, the tongue root is a little
more advanced and there is a little extra raising for /ɭ/, but there is little change
in /r/, which is the most stable of the consonants, as shown above (Figure
11). The root of the dark consonants, in its advanced state due to the influence
of /i/, is still more retracted than the root in the clear consonants.

Figure 13 — Mean (thick lines) and standard deviation (thin lines) of target for trill /r/ and
retroflex lateral /ɭ/ in /a__a/ context (solid lines) vs. mean from pooled /i__a/ and /a__i/
contexts (dashed).


4.4 Summary 

We have presented a qualitative analysis of our small set of articulatory data, 
which supports the clear/dark distinctions of Punnoose (2011) and Punnoose 
et al. (2012). We confirm the nature of the distinction between the tap (which is 
sometimes rather stop-like) and trill (which can undershoot), and between the 
alveolar and retroflex lateral (which appears flap-like). We also notice unusual 
activity in the production of the 5th liquid (which otherwise appears to be a
post-alveolar approximant cum fricative), the nature of which we have not seen
in our previous ultrasound investigations.

Coarticulation to /i/ was found. Impressionistically we noted the appearance
of audible and spectrally-visible frication in the liquid when /i/ preceded it, 
though no difference in tongue shape was apparent (but beware that the sample 
was very small). The dark trill was particularly resistant to coarticulation, and it 
may be that both it and the retroflex lateral have a more braced tongue root and 
dorsum to support their anterior articulations (see below). 


5. Previous articulatory research on Tamil and Kannada 


5.1 The 5th liquid

McDonough & Johnson (1997) is a single-speaker study which makes a useful 
point of reference for our work here. They examined the five liquids in the
Brahmin dialect of Tamil. This related language also has two rhotics (an
alveolar and a retroflex flap); two laterals (an alveolar and a retroflex); and a 5th
liquid. The aim of their small-scale study was to investigate the articulatory,
acoustic and perceptual characteristics of the five Tamil liquids, in particular
the 5th one, in [VCV] contexts. Via electropalatography (EPG) and static
palatography, the articulation of the Tamil 5th liquid was shown to involve
tongue contact on the hard palate, as is typical for retroflex sounds. However,
unlike /ɽ/ there was no evidence of any forward motion during the consonant
closure, and unlike /ɭ/ there was no opening at the rear lateral edge of the
palate in the EPG data. 

A difference was also found between /ɽ/ and /ɭ/ vs. /z/, in that the linguogram
and EPG data showed a mid-sagittal gap in contact between the tongue and 
the palate, suggesting a dip behind the main constriction of /z/. Taken with 
their acoustic results, McDonough & Johnson describe this segment as being 
"an apical retroflex central approximant with static articulation, no laterality 
and only incidental frication" (McDonough & Johnson 1997: 22). 

In our Malayalam speaker, the 5th liquid is certainly not static in its active
articulator, though we cannot tell how the contact patterns change, nor whether
there is any laterality. There appears, for our speaker, to be more friction, and the
constriction appears laminal.

Narayanan et al. (1999) also studied sustained productions of each of the 
Tamil liquids in a word-final /paC/ context, though the methodologies were 
rather different. The first author produced each liquid, and was recorded using 
static palatography, magnetic resonance imaging (MRI), and electromagnetic 
magnetometry (EMMA). They found the 5th liquid (in their transcription
system it was /1/) involved an anterior tongue body articulation with the
narrowest constriction in the palatal region, though the exact location varied in
an inconsistent way. The 5th liquid’s back cavity was larger than that of the other
liquids in Tamil (due to root advancement) and was ‘depressed’ around its retroflex
constriction, as had also been found by McDonough & Johnson (1997). However,
contrary to McDonough & Johnson's findings, both the 5th liquid and the retroflex
lateral had a dynamic circular kinematic character, which is more similar to
what was seen here for the 5th liquid and retroflex lateral in our Malayalam
speaker, whose movement was, in addition, more complex than a simple circular
movement.

Taking articulatory and acoustic data together, both McDonough & Johnson
(1997) and Narayanan et al. (1999) suggest that the 5th liquid in Tamil is a central
retroflex approximant, therefore a third rhotic. However, their descriptions of
the type of retroflex articulation (static vs. back-to-front) contradict each
other (and our observations). Furthermore, their
findings reveal perceptual and spectral similarity between this sound and the
retroflex lateral, which might explain some of the controversy surrounding the
identity of the 5th liquid, although acoustic evidence was presented to argue for
a classification of the Tamil 5th liquid as a rhotic. We have only discussed the
acoustics of the speaker here briefly, but Punnoose (2011) and Punnoose et al.
(2012) explore Malayalam acoustic patterns in detail, looking at the first four
formants of the liquids, along with their phonotactics, and conclude that both
resonance and primary liquid manner may be equally relevant in understanding
the system and placing the 5th liquid in it. This conclusion suggests that
further comparative research on these related languages would be highly
desirable, not least because it may be unlikely that we find the same pattern in
each.


5.2 Retroflexion and tongue-root stability 

Such broader cross-linguistic work is under way. Recently, Kochetov et al. 
(2012) looked at (geminate) obstruents in Kannada and found that the 
retroflex stop in an /a__a/ context had a fronted tongue root compared to other
geminate obstruents. It is not clear yet whether this is characteristic of any 
other retroflexes in Kannada, or how Kannada liquids behave, just as it is not 
clear what happens in Malayalam retroflex obstruents. From Kochetov et al.'s 
figures, we estimate that the root moves forward in Kannada from a neutral 
or /a/-like position ~300 ms before the centre of the voiceless retroflex stop by 
about 5 mm-10 mm. Contrast this with the highly stable (retracted) root in 
the Malayalam retroflex lateral above (Figure 9) in the same vowel context. In 
Malayalam there is root movement in a front vowel context: indeed the extent
of movement in the retroflex lateral flap in an /a__i/ context seems comparable
to the Kannada /a__a/ context, both from /a/ forwards to /ɭ/ (the shape charted
in Figure 13) and then again to /i/ (Figure 14). The location of the
forward-moving constriction has, however, the same anterior place of articulation as
it does in /aɭa/. Kochetov et al. note that earlier articulatory research on Tamil
(including Narayanan et al. 1999) had also found that the pharyngeal cavity
was wider in retroflexes than in dentals (and the neutral position).

Figure 14 — /aɭi/ from the start of the word (0 ms) with 10 ms tracings and thicker lines
representing the shape at the time of the most retracted blade (130 ms), at the time of an
acoustic flap event at the transition between /ɭ/ and /i/ (220 ms) and when the target for /i/
was reached during the 2nd vowel (300 ms).


Since the root in Malayalam liquids seems to have a different target in clear vs. 
dark resonant liquids and, to a lesser extent, varies due to coarticulation with an 
adjacent vowel, we might expect Kannada and Tamil liquids to be similar, in a 
way that would be predictable from their acoustic resonance.


6. Conclusions


This was a single-speaker study, with the modest aim of providing preliminary
results which are likely mainly to raise questions and hypotheses for future
research; but the results here do seem to fit well with the findings of Punnoose
(2011), at least for the liquids in an intervocalic /a__a/ context, as well as with
those of Narayanan et al. (1999) for Tamil in terms of showing a rather rhotic
lingual articulation for the 5th liquid. The high-speed ultrasound data has also
revealed better than other articulatory techniques some of the coarticulatory 
and dynamic complexity of these sounds. Of course, we need to look at other 
speakers, materials, and styles, to get an idea of how articulation can vary, 
before we can conclude what the key elements of the system are. For example, 
for a proper phonological analysis we need to know if certain speaker groups 
rely on the clear/dark resonance difference as much as, or more than, differences 
in manner or primary place, and in what sorts of tasks and contexts. Moreover, 
we would need to check whether the presence of the ultrasound headset and 
probe, or just the recording set-up, might have affected any of the articulations 
here — this speaker seemed to have quite a lot of frication and to produce his 
taps more as short stops, and the retroflex lateral as a flap rather than as a plain 
approximant. 

One general question for future work relates to the extremely stable nature 
of the root in the trill and retroflex flap in the /a/ context (and their reduced 
but still evident stability elsewhere, i.e. in /aCi/ and /iCa/ contexts). Is this 
intrinsic stabilization of the back of the tongue in liquids of this type indicative
of a very high coarticulatory resistance (Recasens & Pallarés 1999; Zharkova
& Hewlett 2009), perhaps because bracing is required to facilitate trilling
or other complex anterior articulation (Narayanan et al. 1999)? If Recasens
& Pallarés (1999) found high stability for the Catalan trill, using acoustic
and electropalatography (EPG) data, then our results might confirm their
comment that unlike the tap (our emphasis), the Catalan trill “involves a high
degree of tongue body constraint” (ibid:163). On the other hand, the stability
or not in this /a__a/ context may be a reflection of the different clear/dark 
resonances of these liquids in Malayalam. 

As a reviewer pointed out to us, these need not be antagonistic or independent 
goals. It may be the case that the characteristic dark resonances are the acoustic 
signature of a retracted and stabilized tongue body, a point argued previously 
on the basis of data on trills in Russian (Kavitskaya et al. 2009; Proctor 2011) 
and Spanish (Proctor 2011). The coarticulatory pressure from a nearby vowel, 
/a/, the language-specific characteristic of darkness, and the tendency for a
stabilized root in the trill and retroflex flap may all come together to create
the immobility seen in Figures 6 & 9. More work needs to be done on the 
effect of other vowel contexts on the Malayalam liquids to work out the relative 
importance of these factors. 

From the existing ultrasound research on 4 speakers of Spanish (Proctor 2011),
we can surmise that the trill and tap appear alike in showing little root movement
in the /a__a/ context, compared to /d/ and /l/, but it is hard to be sure given
the narrower field of view of Proctor's data, in which less of the root is imaged. As
noted above, our Malayalam data shows a clear dynamic difference between the 
dark trill and clear tap in this vowel context. Proctor focuses on the ‘dorsum’ 
or ‘body’ in the liquids, noting “during the production of the trill, in contrast 
to the obstruent, the tongue body moves up and forward — away from the 
articulatory target of the context vowel — which suggests that this movement is 
intrinsic [to /r/]" (ibid:457). He also states that dorsal advancement is seen in 
the tap and lateral. In an /e__e/ context, however, the lateral and tap are stable, 
while the dorsum retracts as the tongue moves from the preceding vowel into 
the trill. The location of this intrinsic dorsal target varies from liquid to liquid 
in Spanish: the lateral has an advanced dorsum, the trill a retracted one, and 
the tap is intermediate. We saw above (Figure 11) that the tongue root and 
dorsum in Malayalam also move, in the release of the trill towards a following 
front vowel. The targets for the liquids, as in Spanish, show coarticulation 
from the flanking vowel (Figures 12, 13). The root is more advanced next to /i/ 
in all five liquids, but while the dorsum is raised in the three clear ones and in 
the dark retroflex lateral tap, it is not raised in the trill. The trill is therefore the 
most stable liquid, and in fact this is comparable to Spanish when considering
just /e/ and /a/ contexts: coarticulation in the trill is only evident once /u/ is taken
into account, which we cannot do here. Proctor's quantification of coarticulation 
suggests all three liquids' dorsal constrictions vary to similar degrees, something 
else we cannot examine in our own limited data. 

Russian is interestingly different to Malayalam since it has contrastive clear/dark
resonances. Briefly, Proctor's (2011) study of 4 speakers found the dark
(non-palatalised) trill and lateral were more dorsally stable across different
vowel contexts than a non-palatalised /d/, indicating greater stability in that
area. Clear liquids also showed more dorsal stability than the comparable clear
obstruent. In sum: the clear liquids (and obstruent) had a palatal (‘anterior
dorsal’) target; the non-palatalised /l/ had a uvular-pharyngeal target; and the
non-palatalised trill was rather intermediate with a backish target.

In general, if we could get more articulatory data to augment the mid-sagittal 
view, it would be particularly useful for understanding the laterals and the
5th liquid. In the AAA multi-channel system it would be easy to add
synchronized lip video (60 fps) and EPG (200 fps), which can be captured 
simultaneously with hs-UTI. EPG gives excellent spatio-temporal information 
on anterior tongue-palate contact for taps and trills, as shown by Recasens & 
Pallarés (1999), and centre-only contact to indicate the presence of laterals (cf. 
Scobbie & Pouplier 2010 for English), and both these studies also show how 
EPG can be used to detect secondary articulation. 

Additionally, it would be very useful to get some coronal section data from 
ultrasound, in an attempt to understand how the tongue surface of the tip and 
blade deforms, stretches and moves to enable lateral airstream(s) and hence 
alter the resonance characteristics of the complex oral cavity tube(s). However, 
as with the data here, the sublingual cavity would prevent us gaining a complete 
view of the tongue tip, and coronal scans cannot be made simultaneously with 
the mid-sagittal scans with current equipment. A high-speed MRI system 
might also be able to capture the relevant articulation. 

We think such extra information would be particularly useful for understanding
Malayalam’s 5th liquid. The tongue surface data we have seen, augmented by our
visual inspection of tongue-internal features, suggests that there are volumetric 
changes in the tongue blade that the mid-sagittal curves simply do not capture 
well. The tip seems to extend forward during the following vowel, suggesting 
it has been retracted during the liquid. While a flesh-point tracking technique 
like EMA (electromagnetic articulography) would be very useful in showing 
how the blade and tip upper surface might be extending, it would be harder 
to get dynamic data on how the tip might be thinning or lowering laterally, 
without, that is, the challenge of fixing a coil to the sensitive sublingual surface. 
Finally, it may be the case that a categorical analysis of the 5th liquid as either a
rhotic or lateral liquid is not desirable, phonologically. It is, after all, an ambiguous 
segment. Its phonetic characteristics would not seem to be predicted by a simple 
formal phonological classification of this segment as being ‘rhotic’, or not. We 
do of course need more articulatory data. If this 5th liquid does indeed have the
unusual kinematic properties which are suggested here, i.e. a movement path 
which we have not observed in typical rhotics or laterals, this would perhaps 
help to explain its ambiguous status in Malayalam’s large liquid set.


Abstract

Acoustic and articulatory (EPG) examination of the Greek rhotic in several prosodic 
positions (singleton phrase initially, word initially and word medially, also in /Cr/¹
clusters and /rC/ sequences) revealed a single constriction of short duration suggesting a 
tap articulation. This contained a vocalic part in /Cr/ and /rC/ contexts, but interestingly, 
also in phrase initial position when the rhotic was followed by a vowel. The constriction 
phase had a fairly stable duration and was shorter than the vocalic part, whose duration 
depended on prosodic position and context: it was longest phrase initially, next longest 
in /rC/ sequences and shortest in /Cr/ clusters. Finally, the vocalic interval's formant 
structure was typically similar to that of the nuclear vowel, but with more centralized 
formant values. We hypothesize a vocalic gesture upon which the rhotic is superimposed. 
Articulatorily, the place and degree of constriction of the tap varied as a function of
prosodic position, context and speaker.


1. Introduction 


The phonetic variability of rhotics across and within languages has been noted
repeatedly (e.g. Lindau 1985; Ladefoged & Maddieson 1996; Catford 2001).
This variability in realization has been the sole subject of the r-atics conference,
now in its 3rd occurrence, contributing a large body of evidence on the many faces
of /r/ in language after language (e.g. Demolin 2001; Docherty & Foulkes 2001,
among others). Apart from sociolinguistic context, variation has been reported,
from a more phonetic viewpoint, as a function of phonetic context, prosodic
position and speech rate (Lindau 1985; Inouye 1995; Recasens & Espinosa
2007). This paper follows the phonetic-oriented rather than the sociolinguistic
methodology in reporting on the variability of the Greek rhotic as spoken in Standard
Modern Greek.


1 — Throughout the paper, the symbol /r/ is used for the Greek rhotic for practical reasons.

In the Greek literature, recent laboratory studies describe the rhotic as a tap in
intervocalic position (Nicolaidis 2001; Baltazani 2005, 2009) or in initial and 
intervocalic position (Arvaniti 1999). All studies have reported considerable 
variability in its acoustic and articulatory characteristics. Both a tap and an 
approximant realization have been observed (Nicolaidis 2001; Baltazani 2005, 
2009) and its place of articulation has been reported to vary across alveolar, 
retracted alveolar, and postalveolar positions (Nicolaidis 2001). 

The most recent studies have documented the presence of a vocoid between the rhotic
and the consonant in /Cr/ clusters and /rC/ sequences, and more interestingly, in
phrase initial position when /r/ is followed by a vowel (Nicolaidis & Baltazani 2011,
2013; Baltazani & Nicolaidis 2013, collectively referred to henceforth as N&B). 
While no other study, to our knowledge, has reported a vocoid accompanying 
a singleton /r/ in phrase initial position in other languages, several studies have 
detected a vocoid in /Cr/ clusters and /rC/ sequences, in Catalan, several Spanish 
dialects, in Romanian, and Hungarian (e.g. Bradley & Schmeiser 2003; Bradley 
2004; Recasens & Espinosa 2007; Vago & Gósy 2007; Savu 2013). 

Arvaniti (2007) claims that this more complex articulation of /r/ in Greek 
indicates trill production in clusters while Baltazani (2005, 2009) interprets it as 
a tap with a vowel-like transition. The electropalatographic data reported in N&B 
typically show one constriction present, providing evidence of a tap articulation. 
There are two types of cross-linguistic accounts for /r/, especially in consonant
clusters. Both assume that the vocoid is part of the nuclear vowel which underlies 
the whole syllable and is briefly exposed between the consonants: one accounts 
for this as the result of gestural overlap between the two consonantal gestures 
(Romero 1996; Bradley 2004; Recasens & Espinosa 2007) and the other, in a 
slightly different vein, hypothesizes that the unmasking is due to the tongue 
movement trajectory of the tap, which cocks back to gain momentum before 
tapping (Inouye 1995). On the other hand, Blecua (2001) argues that the vocoid is 
an inherent part of the rhotic based on the observation that the formant structure 
of the vocoid is similar but not identical to that of the tautosyllabic vowel. 

The former two types of account mentioned above are not supported by our
results which document a vocoid even for /r/ in vocalic environments, e.g. in
phrase initial position (##rV). Instead, in line with literature on coarticulation
(e.g. Öhman 1966), we hypothesize that the vocalic gesture is an integral part
of the rhotic upon which the tap constriction is superimposed. 

This study compares the rhotic acoustic and articulatory realization across 
positions explored in previous studies in N&B. It attempts a synthesis of the 
previous results, offering a unified interpretation of the Greek rhotic production 
on the basis of an analysis that studies the rhotic across several prosodic positions
using a consistent experimental and methodological design. It addresses two
main issues: first, the effect of context and prosodic position on the rhotic 
duration, articulation and the vocoid formant structure; second, on a more 
theoretical level, an explanation for the vocoid based on our empirical data. 


2. Our experimental data 


2.1 Method 


The N&B experiments examined /r/ in real words, where possible, in the
environment of all five Greek vowels, /i, e, a, o, u/²; test words were up to four
syllables long. Five adult speakers, AT, TP (male) and MM, KN, RP (female),
repeated the material five times at a comfortable speaking rate. Apart from
phrase initial position, where the test word was uttered in isolation, test words
were embedded in the carrier phrase [i 'leksi 'ine ___ a'pli] 'The word ___ is simple'.
We examined /r/ in five positions: phrase initial (/##rV/), word initial within
a phrase (/i#rV/, henceforth 'word initial'), word-internal intervocalic (/arV/,
'intervocalic'), in /Cr/ clusters and in /rC/ sequences (henceforth 'C-contexts'
will refer to both /Cr/ and /rC/ unless only one of them is discussed). C-contexts
contained symmetrical VCrV and VrCV sequences, with C = /p, t, k, f, θ, x/. In
singleton /r/ conditions the /rV/ syllable was stressed, but C-context words had
variable stress. The cross-experiment total was 1875 tokens.

In all experiments we simultaneously collected acoustic and EPG data using the 
British EPG system marketed by Articulate Instruments. The artificial palate 
used in this system has 62 electrodes on its surface, which are distributed in 
eight rows. The front four correspond to the alveolar zone, which is further 
subdivided into the alveolar region (rows 1 to 2) and the postalveolar region
(rows 3 to 4). The back four rows of electrodes correspond to the palatal zone 
(Recasens et al. 1993). In addition, a separate recording of acoustic data was 
made on a digital recorder (Marantz PMD 660) with a Rode NT1-A cardioid
condenser microphone. Acoustic data were analysed using PRAAT. 

We measured the durations of the rhotic constriction phase and of the vocoid,
as well as the F1 and F2 formants of the vocoid and of the flanking vowel(s) to
detect possible environment influences on the vocoid. The onset of the constriction
phase, together with the onset of the voicebar, was marked at the offset of
silence, preceding vowel or vocoid, depending on prosodic position. The offset of
constriction was marked at the beginning of the formants for the following vowel
or vocoid. The beginning and end of the vocoid were marked at the onset and offset
of its formant structure respectively (see e.g. Figures 8 and 10). The duration and
formant measurements were automatically obtained through a PRAAT script.

² For a description of the Greek vowels, see Arvaniti (1999, 2007).

For the articulatory analysis, the first EPG frame of maximum contact/constriction
in the four front rows was annotated (Figure 1a, b) as constriction always occurred in
the alveolar zone. The frame of maximum contact typically coincided with the frame
of maximum constriction; in the few instances that it did not, the frame of maximum
constriction was annotated. The percentage frequency of electrode activation of the
entire palate, i.e. all eight rows, over five repetitions was then calculated at the frame
of maximum contact/constriction for the rhotic in each test word.
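
The articulatory measure just described can be illustrated with a short sketch. This is not the procedure or software used in the study: it assumes, purely for illustration, that each EPG frame is stored as an 8 x 8 grid of 0/1 contact values (the two corner cells of row 1 that carry no electrode on a 62-electrode palate simply stay at 0), and all function names are hypothetical. It only covers the typical case in which the frame of maximum contact is the one annotated.

```python
FRONT_ROWS = 4  # rows 1-4: the alveolar zone, where the constriction occurred
SIZE = 8        # assumed grid size: eight rows of up to eight electrodes


def empty_frame():
    """Return one frame with no tongue-palate contact."""
    return [[0] * SIZE for _ in range(SIZE)]


def front_contact(frame):
    """Number of contacted electrodes in the four front rows of a frame."""
    return sum(sum(row) for row in frame[:FRONT_ROWS])


def first_max_contact_frame(token):
    """Index of the first frame with maximum contact in the front rows."""
    totals = [front_contact(frame) for frame in token]
    return totals.index(max(totals))  # index() returns the first maximum


def activation_frequency(frames):
    """Percentage of repetitions in which each electrode is contacted.

    frames: one annotated frame per repetition of the same test word.
    """
    n = len(frames)
    return [
        [100.0 * sum(f[r][c] for f in frames) / n for c in range(SIZE)]
        for r in range(SIZE)
    ]


# Invented data: five repetitions, one row-2 electrode contacted in four of them.
reps = [empty_frame() for _ in range(5)]
for frame in reps[:4]:
    frame[1][3] = 1
freq = activation_frequency(reps)
print(freq[1][3])
```

Each cell of the returned grid corresponds to one electrode, so the result can be drawn directly as a palatogram of the kind shown in the figures.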


2.2 Results 


2.2.1 Articulatory results 

The articulatory analysis showed that the Greek rhotic is produced with a single 
constriction of short duration, both in C-contexts and in singleton /r/ positions 
suggesting a tap articulation (Figures 1a, b, 8). Some tokens involving trill production 
were found but they were very few across contexts/positions (for details see N&B). 




Figure 1(a, b) — Acoustic and electropalatographic data for the rhotic in [le'pres] above and 
['fortos] below (speaker TP). The annotation line corresponds to the first frame of maximum 
contact/constriction in the alveolar zone and the corresponding palatogram is shown at the 
top right of the display. A single tap gesture is evident for the rhotic in both tokens (see also 
palatograms and contact totals displays below the spectrograms).

However, variability in the articulation of the tap was evident across the data
as there were tokens with complete constriction and tokens with incomplete
constriction. The latter ranged from very constricted to very open articulations.
These patterns related to variability in the acoustic signal. For tokens with
complete constriction, there was evidence of a stop-like pattern frequently with
a burst present (Figure 2a). For tokens with incomplete constriction, undershoot
was manifested variously: a stop-like pattern but with no abrupt discontinuity at
release, i.e. no burst (Figure 2b), noise/breathiness during constriction (Figure 2c),
or formant structure indicating approximant production of the rhotic (Figure 2d).

Figure 2(a-d) - Differences in the degree of constriction of the rhotic. Complete constriction
in [ma'rika] (a) and incomplete constriction in [ma'ruli], ['rama] and [ma'rika] (b, c, d), and
variation in the acoustic signal (see text for details).


The degree of constriction was influenced by several factors. First, an effect 
of singleton vs. cluster/sequence production was found, as most tokens with 
incomplete constriction were found for singleton /r/ (63%, 236 out of 375). 
Second, there were more tokens with incomplete constriction in heterosyllabic
/rC/ (57%, 426 out of 749) than tautosyllabic /Cr/ contexts (47%, 351 out of
748).

Third, for singleton /r/, prosodic position had an effect on degree of constriction.
More tokens with incomplete constriction were present for word initial position,
i.e. 78% in comparison to 57% for phrase initial and 54% for word medial (Table
1, see also Figure 6).

Finally, for /r/ in C-contexts, overall more tokens were produced with incomplete
constriction in the context of a fricative compared to a stop, i.e. /fricative-r/
49% and /r-fricative/ 67% compared to /stop-r/ 44% and /r-stop/ 47%
(see Figure 5; note speaker variation in Table 1).

Table 1 presents the numbers of tokens produced with incomplete constriction 
for singleton /r/ and C-contexts for all speakers. In addition to the variation 
noted above, large speaker variability is evident. For instance, for speakers KN, 
AT and RP more productions involved incomplete constriction systematically 
across conditions compared to MM and TP. This suggests different speaker 
strategies in rhotic production. 




                     KN    AT    RP    MM    TP    All speakers
SINGLETON /r/
  Phrase initial     14    21    21     7     8
  Word initial       23    25    22    17    11
  Word medial        18    17    21     8     3
  Total              55    63    64    32    22    236 of 375
/Cr/ CLUSTERS
  Stop-r             48    53    29    24    12
  Fricative-r        54    43    51    29     8
  Total             102    96    80    53    20    351 of 748
/rC/ SEQUENCES
  r-Stop             59    41    34    14    28
  r-Fricative        71    62    48    30    39
  Total             130   103    82    44    67    426 of 749


Table 1 — Number of tokens showing incomplete constriction for singleton /r/ and C-contexts. 
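
The percentages quoted in the text for the three conditions can be rechecked from the token counts. The sketch below is only an arithmetic illustration using the counts reported above (tokens with incomplete constriction out of all tokens per condition); the variable and function names are hypothetical.

```python
# (tokens with incomplete constriction, all tokens) per condition, as reported
counts = {
    "singleton /r/": (236, 375),
    "/Cr/ clusters": (351, 748),
    "/rC/ sequences": (426, 749),
}


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


# Rounded rate of incomplete constriction per condition
rates = {name: round(percent(k, n)) for name, (k, n) in counts.items()}

# Overall rate across the three conditions
overall = percent(
    sum(k for k, _ in counts.values()),
    sum(n for _, n in counts.values()),
)

print(rates)    # {'singleton /r/': 63, '/Cr/ clusters': 47, '/rC/ sequences': 57}
print(round(overall, 1))  # 54.1
```

The overall figure is consistent with the later statement that more than half of the tokens were produced with incomplete constriction.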


These values should be considered with caution, as it is possible that complete
contact may not have been registered for some tokens due to the sampling
rate of the EPG system (10 ms). Observation of the EPG and acoustic data 
indicates that, if this has occurred, it involves a very limited portion of the data 
(tokens involving very constricted productions) as there were clear differences in 
the acoustic waveform among tokens produced with complete and incomplete 
constriction. Further analysis can estimate such cases more precisely. Such 
a shortcoming is expected to affect /r/ production to a similar degree in all 
contexts, as it is random. Thus although it may result in less accurate absolute 
values, it is not expected to affect the accuracy of the differences reported across 
conditions. 

With reference to the place of articulation of the rhotic, the constriction location 
in the alveolar zone, i.e. the four front rows of electrodes, was found to vary as a 
function of context, prosodic position and speaker. 

Figure 3 illustrates the influence of the vocalic context on the place of articulation 
of the rhotic. Overall, more advanced production was evident in the front vowel 
contexts /i, e/. More retracted articulation was generally present in the rest of 
the contexts with several tokens showing greatest retraction in the context of 
/a/ and/or /o/. The data showed therefore articulation in the alveolar zone but 
the precise place of rhotic articulation varies from alveolar, retracted alveolar, 
advanced postalveolar to postalveolar depending on the vocalic context.

Figure 3 - EPG palatograms displaying percentage frequency of electrode activation over
five repetitions during the production of the /r/ in word-internal intervocalic position (top) by
speaker AT, in /Cr/ clusters (middle) and /rC/ sequences (bottom) by speaker MM.


The consonantal context also influenced /r/ production in C-contexts. More
fronted production was overall evident in the context of the dentals (Figure 4).

Figure 4 - Production of /r/ in /θr/ and /xr/ clusters by speaker RP.


As noted above, context also had an effect on the degree of /r/ constriction
in C-contexts. Overall, more open articulations were present in the context of
fricatives (Figure 5).

Figure 5 - Production of /r/ in /rt/ and /rθ/ clusters by speaker KN.


Figure 6 illustrates variation in the degree of constriction for singleton /r/
in phrase initial and word initial position. As noted previously, more open
productions were evident in the latter position.

Figure 6 - Production of /r/ in phrase initial (top) and word initial position (bottom) by speaker KN.


Finally, the speaker was an important source of variation. Inter- and intra-speaker
differences in degree of contact, place of articulation and degree of constriction
were found. Figure 7 illustrates such differences: /r/ is produced at a more
retracted place of articulation, with more instances of incomplete constriction
and greater amount of contact in the palatal zone by speaker RP compared to TP.


Figure 7 - Production of /r/ by speakers RP (top) and TP (bottom).


2.2.2 Acoustic results

In C-contexts, but more importantly phrase initially where /r/ is not part of a 
cluster, the rhotic structure typically involves a vocoid (Figure 8). 




Figure 8 — Phrase initial tap in ['rama]. Notice the long vocoid duration (60 ms). 


The vocoid was clearly evident in phrase initial position where it had the longest 
duration, while in word initial and intervocalic positions, it was not as easy to 
discern due to the flanking vowel environment. Thus in these last two positions 
no measurements were made. However, acoustic evidence, like discontinuities
and/or an abrupt change in amplitude and formants during the pre-rhotic
vowel (V1) (Figure 9), suggests the presence of a vocoid adjacent to V1 and
to some degree overlapping with it even in these positions (cf. Savu 2013 
for similar evidence in /VrV/ contexts in Romanian; Willis 2006 for another 
interpretation of these acoustic characteristics). For more details see Baltazani 
& Nicolaidis (2013). A possible alternative interpretation may account for such 
acoustic manifestations during the vowel as solely resulting from coarticulatory 
influence from the flanking vowel. Still it is interesting to note that there are 
frequently abrupt discontinuities present resulting in a vocalic interval that is 
relatively separate and of a remarkably similar duration to the vocoids found in 
other prosodic positions.

Figure 9 - Acoustic evidence for a vocoid in intervocalic position. The last third of V1 in [ma'rika]
shows a discontinuity and a change in formants.


The acoustic measurements revealed variability in the vocoid production, 
which ranged from a modal vowel to a breathy/whispered one (top and bottom 
of Figure 10). Finally, there was a tendency for more tokens with whispered/ 
breathy vocoids or frication noise during the constriction phase in heterosyllabic 
/rC/ sequences than in tautosyllabic /Cr/ clusters. This suggests more 
assimilatory effects of the following voiceless obstruent in /rC/ sequences.

Figure 10 - Modal vowel quality for the vocoid in [a'frato] (top) and breathy vowel quality in
['ergete] (bottom).

A comparison across positions shows that the vocoid has longer average duration
than the constriction (Figure 11). Similar results have been found for Spanish 
(Bradley & Schmeiser 2003). 




Figure 11 — Vocoid and constriction duration in different positions. Note that constriction 
duration was measured in all contexts, while the vocoid duration was measured in phrase 
initial and C-contexts only. 


Furthermore, among the positions where the vocoid duration could be 
measured, shown in Figure 11, the longest occurred phrase initially, almost 
twice as long as that in /Cr/ clusters and considerably longer than that in /rC/ 
sequences. We attribute the long vocoid duration in phrase initial position to
the effect of initial strengthening. Differences in vocoid duration in C-contexts 
are attributed to differences in syllabic affiliation, as /Cr/ are tautosyllabic and 
/rC/ heterosyllabic and thus the spatio-temporal coordination of gestures may 
differ in the latter (see also Recasens & Espinosa (2007) for a review of similar 
findings for vocoid duration in /Cr/ and /rC/ contexts in Spanish and Catalan). 
The consonantal constriction was longest for word initial and /rC/ sequences
and shorter for phrase initial, intervocalic and /Cr/ clusters (Figure 11). Note, 
furthermore, that, unlike the vocoid, the differences in the duration of the 
constriction across prosodic positions are small, with only 8.5 ms difference 
between the average longest and shortest duration. 

These comparisons indicate that the different positions/contexts exert an 
asymmetric influence on the two components of the rhotic. One possible 
reason for such asymmetries relates to their articulatory nature. The tap, which 
has been described as a short ballistic gesture in the literature (Lindau 1985; 
Ladefoged & Maddieson 1996; Recasens & Espinosa 2007), is not as free to 
lengthen as the vocoid. 

A comparison of the vocoid quality in the singleton vs. C-contexts revealed 
that across prosodic positions the vocoid formants (measured in Hz) are similar 
to those of the nuclear tautosyllabic vowel and somewhat more centralized
(Figure 12). The amount of centralization varied across prosodic positions,
vowels and gender. In phrase initial position female speakers showed a smaller
degree of centralization than males, while in /Cr/ clusters the opposite trend was
observed. In /rC/ sequences, on the other hand, the amount of centralization 
was relatively similar across genders. On the whole, centralization was more 
pronounced across genders and vowels for /rC/ sequences, probably because 
/rC/ sequences are heterosyllabic.

Figure 12 - Comparison of vocoid formants (in Hz) to the nuclear V in different contexts for
male (left panels) and female speakers (right).

Figure 13 shows considerable variability in the Euclidean distance between the
vocoid and the nuclear vowel across speakers and vowels. On average, across
vocalic environments, the vocoid in /Cr/ clusters has the closest formant values
to the nuclear vowel.

Figure 13 - The Euclidean distance between the vocoid and the nuclear vowel across
speakers and vocalic contexts.
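
For readers unfamiliar with the measure, the Euclidean distance between a vocoid and its nuclear vowel can be sketched as follows. This is a minimal illustration that assumes the distance is taken in the F1 x F2 plane with both formants in Hz; the formant values in the example are invented and the function name is hypothetical.

```python
import math


def formant_distance(vocoid, vowel):
    """Euclidean distance in the F1 x F2 plane between two (F1, F2) pairs in Hz."""
    return math.hypot(vocoid[0] - vowel[0], vocoid[1] - vowel[1])


# Invented values for illustration only: (F1, F2) in Hz
vocoid = (500.0, 1500.0)
nuclear_vowel = (800.0, 1100.0)

distance = formant_distance(vocoid, nuclear_vowel)
print(distance)  # 500.0
```

A smaller distance means that the vocoid's formants lie closer to those of the nuclear vowel, which is the sense in which the vocoid in /Cr/ clusters is described as closest.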


3. Discussion 


Across contexts, articulation of the rhotic typically involved one short constriction 
(ranging between 11-57 ms) suggesting a tap articulation. Similar durations for 
taps have been reported for several languages previously (see Recasens & Espinosa 
2007 and references therein). While single contact trills have also been reported 
before, in the case of utterance initial position they typically involve a much longer 
constriction phase (around 100 ms) than the one reported in the present study (see 
Recasens & Espinosa 2007). An interesting finding of the research reported in this
study is the presence of a vocoid during rhotic production in different contexts. On
the basis of this finding, two main questions addressed in this paper are: 'is there
a vocoid present in all different contexts?' and 'why is there a vocoid?'. While the
presence of the vocoid has been documented in C-contexts before, useful insights can
be gained from the study of phrase-initial /r/ where /r/ articulation is not affected by 
an adjacent consonant. Establishing the existence of a vocoid before the constriction 
phase corroborates the view of the vocoid as an essential articulatory component of 
the rhotic in singleton contexts. If phrase initial /r/ is the ‘canonical’ production, 
then the vocoid can be explained along the same articulatory principles in other 
contexts. The evidence above, together with the indications provided for a vocoid 
in word initial and intervocalic position manifested through abrupt discontinuities
in amplitude and formant structure, casts doubt on an exclusively gestural overlap
account (see section 1) since a vocoid is attested even without another consonant
adjacent to the /r/. Instead, we propose that the rhotic is superimposed on a
rhotic-specific vocalic gesture, which is necessary for the execution of the ballistic gesture
(cf. Blecua 2001), i.e. the brevity/ballistic nature of the tap gesture requires an
underlying vocalic gesture for its execution. Coarticulatory effects are expected in
different contexts, which can account for the spatial and temporal variability present 
during the vocoid and constriction phases of the rhotic. 

Further corroborating evidence for our proposal can be found in word-final /r/ which 
is produced with a vocoid after the constriction (Stolarski 2011 for Polish; Recasens 
& Espinosa 2007; but see Romero 2008 for a gestural coordination account). 
Figure 14 shows in the top panel the palindromic word [re'ver], as produced in the
phrase [re'ver mu] 'my cuffs' by the second author. The word-final vocoid is clear
before the segment /m/ (the initial vocoid and constriction duration is 48 ms and
19 ms respectively; the final vocoid and constriction duration is 26 ms and 23 ms
respectively). The bottom panel in Figure 14 shows /r/ produced in isolation with
one vocoid on either side of the constriction; the initial and final vocoid duration is
36 ms and 54 ms respectively while the constriction itself, which ends with a burst
followed by frication noise, lasts 28 ms (cf. Stolarski 2011 for Polish /CrC/ clusters).

Figure 14 - Top: mirror images of vocoid+constriction in [re'ver]. Bottom: /r/ in isolation.

In line with the above interpretation, i.e. that the rhotic is superimposed on
a rhotic-specific vocalic gesture, the variation observed in the position of the 
vocoid in relation to the consonantal context in /Cr/ vs. /rC/ sequences is 
expected and can be uniformly explained. If the rhotic is superimposed on a 
vocalic gesture then the vocoid is expected to precede the rhotic constriction 
in /CrV/ sequences and follow it in /VrC/ contexts. 

Furthermore, the formant structure of the vocoid was more centralised than 
the nuclear vowel, which was an expected outcome: the V-to-V gesture upon 
which the rhotic is superimposed includes a vocoid which is influenced through 
V-to-V coarticulation by the nuclear vowel to different degrees depending
on the context (singleton, C-context). The influence of the adjacent vowel, 
especially in C-contexts, has been documented for other languages as well 
(e.g. Blecua 2001; Ramírez 2006). 

More specifically, there was a difference between the heterosyllabic /rC/ 
sequences and all the other prosodic positions: in /rC/ sequences, which lack 
syllable coherence, both the vocoid and the constriction are longer than in 
/Cr/ clusters and the vocoid formants are more centralized suggesting less 
temporal compression and reduced spatial V-to-V overlap. However, there is 
C-to-r anticipatory coarticulatory influence across the vocoid both in place and 
degree of constriction. Interestingly, despite the longer vocoid and constriction 
duration, there were more tokens with incomplete constriction than in /Cr/ 
clusters. More C-to-r anticipatory than carryover effects, i.e. more tokens with 
incomplete constriction in /rC/ than in /Cr/ contexts, may relate to the more 
centralized quality of the vocoid in /rC/ sequences. 

On the other hand, the longest vocoid duration was observed for the singleton 
rhotic phrase initially and the shortest for /Cr/ clusters. These findings can be 
interpreted as initial strengthening for the rhotic in phrase initial position, 
realised temporally in the vocoid but not in the constriction duration. The 
shortest vocoid and short constriction duration were found in tautosyllabic 
/Cr/ clusters suggesting temporal compression due to the closer co-ordination 
relations. Carryover C-to-r effects were also found across the vocoid affecting 
both the place and degree of constriction of the rhotic. 

Our data showed variation in place and degree of constriction, duration 
and vocoid formants as a function of speaker, context and prosodic 
position. In addition, the vocoid was typically longer than the constriction. 
While the vocoid length showed considerable variation as a function of 
prosodic position and context, smaller differences were found for the 
constriction, something we interpret as lack of freedom for lengthening 
the tap constriction.

Across experiments, more than 50% of the tokens were produced with incomplete
constriction, ranging from very constricted to very open articulations. A smaller
percentage of productions with incomplete constriction was found in C-contexts
than for singleton /r/, which suggests influence from the consonantal context.
Interestingly, more tokens with reduced contact were found in word initial and 
/rC/ sequences where the constriction is longer. For the former, this suggests 
that more factors, in addition to boundary strength, regulate the amount of 
contact. In particular, more tokens with incomplete constriction in word-initial 
than word-medial position may relate to contextual influence and related 
gestural coordination patterns, i.e. word-initial tokens were preceded by the 
high vowel /i/ of the word 'leksi' in the carrier phrase while word-medial rhotics
were preceded by the open vowel /a/. A more open tongue position during /a/ 
may allow for a more complete ballistic gesture reaching the target for the 
tap. Note that difficulty in attaining closure during taps in the environment 
of a following /i/ has been reported in Recasens & Espinosa (2007) due to 
the nature of the gestures involved. More investigation is necessary for a 
comprehensive account of spatio-temporal variation. 

Finally, the results on the contextual influence, in particular, V-to-r and 
C-to-r effects, indicate that the tongue coarticulates with neighbouring 
gestures during the production of the rhotic in Greek, in line with evidence 
from other languages (e.g. Recasens 1991). While the analysis presented has 
aimed towards a uniform explanation of /r/ production, it should be noted that 
further work is needed so that current and alternative interpretations can be 
tested and firm conclusions can be drawn. This includes statistical analyses of 
the different measures across positions and further qualitative analyses. These
are currently underway.


Another look at the structure of [r]:
Constricted intervals and vocalic elements 


Carmen-Florina Savu, University of Bucharest 


Abstract 

This study investigates the hypothesis that the rhotic segment containing one constricted
interval, [r], has a more complex internal phonetic structure that includes vocalic 
elements flanking the constriction, as suggested in classic studies, as well as more recent 
ones (Stolarski 2011 and references cited therein). The current experiment focuses on 
the quality of the vocalic elements of the sound in Romanian (contexts #rV, Cr, rC) and 
the acoustic analysis shows them to systematically stay mid-high and central (to front) 
across contexts. The paper also briefly touches on a phonological implication of this 
structure of the tap. 


1. Previous studies: Putting contexts together 


[r] is described as the sound involving “a fast, ballistic tongue-tip raising 
movement and a single, short apicoalveolar contact" (Recasens & Espinosa 
2007:1). When the segment is in intervocalic position (context VrV), this is 
seen as a very brief constricted interval on a spectrogram. 

This paper argues for the claim that the tap actually contains two vocalic 
elements, one on each side of this constricted interval, as pointed out in classic 
studies by Polish authors and recently maintained in newer ones (see Stolarski 
2011 and others this author cites). Thus, I aim to show that the tap structure is 
actually 'vocoid-constriction-vocoid'.

Studies indicate that when [r] is bordered by a consonant on one side, while 
having a vowel on the other side, spectrograms show a vowel-like element 
intervening between the consonant and the constricted interval of the tap. This 
means that a vocoid appears to the left of the constriction in Cr, and to the 
right of it in rC. The phenomenon is consistent cross-linguistically for clusters 
(V)CrV and VrC(V) (see Avram 1993; Ramírez 2006; Baltazani 2009, and
others).

The appearance of the vocoids has also been reported where [r] has a
word-boundary (pause) on one of its sides instead of a consonant (see Vago & Gósy
2007, among others). The vocoid is positioned between the pause and the 
constriction of the tap, in word-initial and word-final /r/ (contexts #rV, Vr#). 
For example, a word that begins with [r] actually begins with the vocoid (see 
Figure 3 below). 

This vocalic element has received various interpretations. Ramírez (2006) labels
it “epenthetic”, though its systematic appearance across languages and contexts 
suggests that this is not the case. 

Schmeiser (2009) prefers the term “intrusive vowel” because, from a synchronic
point of view¹, this vocoid does not add an extra syllable to the word, which is
what happens with vowel epenthesis. This may be another argument against
considering the vocalic elements epenthetic. Bradley & Schmeiser (2003)
explain the appearance of this “intrusive vowel” as the result of a less than
maximal overlap between the two articulatory gestures performed to produce 
the tap and the adjacent consonant. While this explanation could account for 
Cr and rC clusters, it does not account for the #rV and Vr# cases, where there is 
no other consonant in the immediate vicinity of the tap. In these cases it would 
be difficult to consider the vocoid as an effect of the gestural transition from one 
consonant to the next. 

Avram (1993) and Baltazani (2009) regard it as part of another realization of
/r/, different from the intervocalic tap. Note that, under this view, we would be
dealing with four realizations of the rhotic: one in intervocalic position, where it
is just a short constriction, another in Cr and #rV, containing a constriction and
a vocoid to its left, another realization for contexts rC and Vr# (a constriction
and a vocoid to its right), and yet another for contexts Cr#, #rC and CrC
(presented below), with two vocoids flanking a constriction.

None of these interpretations considers the vocalic element as part of the tap 
proper. In what follows I attempt to show that the vocalic element observed 
in Cr, rC, #rV and Vr# is one of the two vocalic parts a tap normally contains. 
Thus, I attempt to unify the contexts described above, with seemingly unrelated
phenomena, and argue that, when considered globally, they lead to just one 
realization of the tap. 

Slavic data indicate two vocoids, one on either side of the constriction when [r]
does not border with a vowel at all, but only consonants or pauses. We have the
opportunity to see this in the rarer contexts #rC, CrC, Cr#, which I consider to
be the most important piece of the puzzle. The two vocoids appear in syllabic /r/
in Serbo-Croatian and Slovak (see Gudurić & Petrović 2005 and Pavlík 2008
respectively), as well as non-syllabic /r/ in Polish (see Stolarski 2011). Figures 1
and 2 below illustrate two examples.

¹ An anonymous reviewer points out that diachronically, these vocoids may add syllables to the word or, on
the contrary, the reverse may happen. Indeed, this is an interesting instance of reanalysis of (part of) the
vocoids as full vowels, or, in the reverse case, full vowels may be reanalyzed as parts of the tap. For reasons
of space I do not elaborate on this topic here, but the interested reader may consult Savu (2012).


[r dz a]


Figure 1 - The Polish word rdza 'rust', non-syllabic [r] in #rC (from Stolarski 2011).

[n a v r ɦ]


Figure 2 - The Slovak word návrh 'proposal', syllabic [r] in context CrC (from Pavlík 2008).


What the data appear to suggest, when taking all the contexts into consideration, 
is that the tap's structure may include vocalic elements on both sides of the 
constriction. They are clearly delimited and salient on the side(s) where it 
does not border with a full vowel. Thus, CrC, #rC and Cr# show both vocoids
because the consonants or pauses flanking the rhotic contrast with the vocoids
and emphasize them. Cr, rC, #rV and Vr# show only one vocalic element, either
on the left or on the right, depending on where the consonant or pause which
renders the vocoid salient occurs. VrV would show only the constriction, the tap
having nuclear vowels on both sides for the tap's vocoids to 'melt into'. Therefore,
under this view, the structure of the tap is always the same:
'vocoid-constriction-vocoid', and the phonetic context reveals or hides different parts of it.
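
The proposal summarized above amounts to a simple rule: a vocoid is salient on any side where the tap borders a consonant or a pause, and hidden on any side where it borders a full vowel. The sketch below restates that rule; the function name and the one-letter context codes ('V', 'C', '#') are illustrative assumptions, not notation defined in the paper.

```python
def visible_parts(left, right):
    """Parts of 'vocoid-constriction-vocoid' expected to be salient.

    left, right: neighbour of the tap on each side, coded as
    'V' (full vowel), 'C' (consonant) or '#' (pause / word boundary).
    """
    for side in (left, right):
        if side not in ("V", "C", "#"):
            raise ValueError(f"unknown context code: {side!r}")
    parts = []
    if left != "V":       # no full vowel for the left vocoid to merge with
        parts.append("vocoid")
    parts.append("constriction")
    if right != "V":      # no full vowel for the right vocoid to merge with
        parts.append("vocoid")
    return parts


print(visible_parts("V", "V"))  # VrV: ['constriction']
print(visible_parts("C", "V"))  # CrV: ['vocoid', 'constriction']
print(visible_parts("V", "#"))  # Vr#: ['constriction', 'vocoid']
print(visible_parts("C", "C"))  # CrC: ['vocoid', 'constriction', 'vocoid']
```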


2. The experiment on [r] in Romanian


2.1 Purposes 

The main purpose of the current experiment is to measure the formant structure 
of the vocalic elements of the tap in order to determine how much their quality 
can vary. Another aim is to measure the mean duration of the salient vocoids 
and the constrictions in Romanian words. A third aim is to investigate the 
possibility of the structure argued for in Section 1 being detectable in context
VrV as well.


2.2 Setup 

Recordings of Romanian words in isolation were made, containing /r/ in
contexts #rV, (V)CrV, VrC(V) and VrV, where C is a stop (/p, t, k, b, d, g/)
and V is one of the seven vowels of Romanian (/a, e, i, o, u, ə, ɨ/). Additionally,
recordings of nonsense VrV sequences and sustained tokens of each Romanian
vowel were obtained from each speaker. Clusters Cr and rC were flanked by
either the same vowel on both sides, or by a vowel and a word-boundary. The
idea behind this is to have the tap in the immediate vicinity of only one vowel,
so as not to have it influenced by two vowels of different qualities at once. This
gives the vocalic part the opportunity to have a quality as similar as possible
to that of the vowel that is in its vicinity. For example, the way to find out
how much the vocalic element can approach the quality of [i] is to include the
sequences /iCri/ or /#Cri/, rather than /aCri/. Examples of words used in the
experiment are given below:


Cr:  /abra'ziv/ ‘abrasive’
     /grə'dinə/ ‘garden’
rC:  /kor'don/ ‘belt’
     /tɨrg/ ‘bazaar’
#rV: /radu/ proper name
VrV: /pe'rete/ ‘wall’


The 5 participants (4 female, 1 male) read the words and sequences off PowerPoint
slides 4 seconds apart² and the recording session was repeated three times for
each speaker, the quality of the recordings being adequate for the purposes of
the analysis. The process resulted in a corpus of 1680 words that were subjected to
acoustic analysis with the software PRAAT (Boersma & Weenink 2011).


² The time between slides was introduced in order to exclude coarticulation effects,
especially for context #rV. Frame sentences were not used for the same reason.


Another look at the structure of [r] 


2.3 Results 

The realization of /r/ was that of one constriction with the accompanying salient
vocalic element in 86.66% of the tokens for contexts #rV, Cr, rC (1470 words).
Examples are given in the spectrograms below. Other realizations included trills 
and approximants, but they were not included for acoustic analysis. 


Figure 5 - The word /porto'kale/ ‘orange’, context rC.
2.4 Quality of the vocalic elements

The vocalic elements have been reported to have qualities similar to that of [ə]
and [ɨ] (Avram 1993; Vago & Gósy 2007; Stolarski 2011). However, they have
also been reported to be similar to the nuclear vowels in their vicinity, albeit more 
central (Quilis 1993 cited in Schmeiser 2009; Baltazani 2009, among others). 
These studies were done on languages like Spanish and Modern Greek, which do 
not include mid or high central vowels in their inventories. It would, therefore, 
be interesting to see what happens when the nuclear vowels surrounding the 
tap are themselves central and mid or high. This is an opportunity which a 
language like Romanian provides, with its /ə/ and /ɨ/. Could we narrow down
the possible space of variation of the vocalic elements? 

The graphs below plot the average quality of the vocalic element in each 
word, for all participants, all three times the recording session was repeated, as 
compared to the average quality of the sustained tokens of the seven Romanian 
vowels uttered by the same speakers. 
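The averaging step described here (one mean F1/F2 point per word, pooled over participants and recording sessions) could be sketched as follows. This is a hedged illustration only: the function name, the token layout and all numerical values are invented for the example and are not measurements from the experiment.

```python
from collections import defaultdict
from statistics import mean


def average_vocoid_per_word(tokens):
    """Average the vocoid's F1 and F2 for each word.

    `tokens` is an iterable of (word, speaker, session, f1_hz, f2_hz);
    speakers and sessions are pooled, so each word ends up as one point.
    """
    by_word = defaultdict(list)
    for word, _speaker, _session, f1, f2 in tokens:
        by_word[word].append((f1, f2))
    return {
        word: (mean(f1 for f1, _ in values), mean(f2 for _, f2 in values))
        for word, values in by_word.items()
    }


# Invented measurements (Hz), for illustration only.
tokens = [
    ("radu", "S1", 1, 420.0, 1600.0),
    ("radu", "S1", 2, 440.0, 1650.0),
    ("radu", "S2", 1, 460.0, 1700.0),
    ("kordon", "S1", 1, 380.0, 1500.0),
]
averages = average_vocoid_per_word(tokens)
print(averages["radu"])
```

Each resulting pair would correspond to one small symbol on an F1/F2 plot; the sustained vowels could be averaged in the same way to give the large symbols.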

In the three graphs, the vocalic elements (small size) match the shape of the 
nuclear vowel (large size) they have in their immediate vicinity. For example, the 
small filled triangles correspond to vocoids in sequences /#ra/, /(a)Cra/, /arC(a)/. 
Each small-sized symbol is dedicated to one word used in the experiment and its 
position on the graph represents the average formant values of the vocoid in the 
respective word, across participants and recording sessions. For context #rV, there 
were two words used per full vowel, hence two small shapes for each large one. Cr 
and rC have six words per full vowel, one for every stop consonant (/p, b, t, d, k, g/). 
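As a consistency check on the design just described, the number of items per context can be related to the token counts given in Sections 2.2 and 2.3. The sketch below assumes that every speaker produced every item in every session; the variable names are introduced here for illustration.

```python
N_VOWELS = 7      # full vowels of Romanian
N_SPEAKERS = 5
N_SESSIONS = 3

# Words per full vowel in each context, as stated in the text.
words_per_vowel = {"#rV": 2, "Cr": 6, "rC": 6}

items_per_session = sum(n * N_VOWELS for n in words_per_vowel.values())
tokens_three_contexts = items_per_session * N_SPEAKERS * N_SESSIONS

# Items in the whole corpus that fall outside these three contexts.
remaining_tokens = 1680 - tokens_three_contexts

print(items_per_session, tokens_three_contexts, remaining_tokens)
```

Under that assumption the three contexts yield 98 items per session and 1470 tokens overall, which agrees with the figure reported for #rV, Cr and rC.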
Graph 1 - Vocalic elements in context #rV.
Graph 3 - Vocalic elements in context Cr.


The graphs show that, for all contexts, the vocalic elements tend to approach the
quality of the full vowels they are surrounded by, but there appear to be certain
limits to this variation. They remain mid-high, central to front³ and seem to
consistently stay away from [a], [o] and [u]. This is especially easy to observe
when looking at what happens to the vocalic parts surrounded by [ə] and [ɨ].
Having these full vowels around pushes the vocalic elements to a slightly more
front area than the central vowels are in themselves, which is a strong indication
that the point where the vocoids cannot reach further back is near. Indeed, as
mentioned above, the vocoids corresponding to [o] and [u] are much more front
than these two vowels. In fact, it appears that when the full vowel around the
tap is [o] or [u], the vocoids are very similar to, or show overlap with, [ə] and
[ɨ]. The vocoids are also much higher than [a] in all contexts, remaining mid.
Though there are limits to the backness of the vocalic elements of the tap, they
can be quite front, approaching [i] and, to a lesser extent, [e].


³ As pointed out by an anonymous reviewer, the quality of the tap's vocoids, as shown by the graphs, raises a
phonological question: what is their featural specification, if we are to consider that the complex acoustic
structure is mirrored in phonology? Are the vocoids underspecified for height and backness? One way to
approach the issue would be through statistical analysis, as suggested by the reviewer, or by phonological
study. I leave this matter for further research.

The graphs also show variation according to context. Cr allows the vocalic
elements to vary and approach the quality of the surrounding vowel the most.
The most front and high vocoids may be found in this context, namely those in
the /(i)Cri/ sequences. The /(e)Cre/ words also appear to contain vocoids that
are closer to [e] than other contexts.

Context rC keeps the vocoids closer together than Cr, and context
#rV clusters them even more tightly, in a mid-central area (in agreement with
Baltazani & Nicolaidis 2013).

Let us now see if the place of articulation of the consonant has an influence on 
the vocoid in the word in which it occurs. This is shown in Graph 4 below. 
Graph 4 - Average vocoid by place of articulation of the C in Cr and rC.


In Graph 4, the smaller symbols represent the average vocoid according to the 
vowel flanking the cluster Cr or rC (shape of the symbol), and according to the 
place of articulation of the C(onsonant) in the cluster. The light gray shapes 
stand for the vocoids when C is a dental stop. The dark gray shapes represent 
clusters with bilabial stops, while the black symbols are for velar stop clusters. 




The vocalic elements in contexts Cr and rC have been averaged together for 
the same place of articulation of the C. For instance, the small light gray filled 
triangle represents the vocoids in sequences /art(a)/, /(a)tra/, /ard(a)/, /(a)dra/, 
again across participants and recording rounds. 

As Graph 4 shows, there appears to be a tendency for the vocoids in bilabial 
stop-rhotic combinations to be slightly more back, while vocalic elements in 
clusters with dental and velar stops tend to be more front. That said, the C in Cr 
and rC clusters does not seem to have a significant effect on the tap's vocoids. 
The influence of the full vowel flanking each cluster is clearly stronger.


2.5 Durations 

On average, measurements show that the duration of the constricted interval
is shorter than the duration of the vocalic element. Table 1 shows that the
vocoid in a word-initial tap (context #rV) has the longest average duration, and 
the difference between the vocoid in this context and other contexts is quite 
significant (more than 20 ms), as reported for Greek in Baltazani & Nicolaidis 
2013. The average vocoid in Cr and rC has about the same duration, while Cr 
has the shortest constricted interval. 


AVERAGE DURATION     #rV     rC      Cr      VrV
VOCALIC ELEMENT      49.0    31.0    30.3    -
CONSTRICTION         27.3    26.9    24.0    28.8


Table 1 - Average durations (ms) of constrictions and vocalic elements.


The current experiment did not control for factors like speech rate, word length
and stress placement, which could influence the durations, so more investigation 
is needed in order to elaborate further on this issue. 


2.6 Context VrV: formant changes 

"Abrupt formant changes" have been reported during the first vowel, towards 
the constriction, when the tap is in context VrV (Baltazani & Nicolaidis 2013). 
It would be expected that this phenomenon should occur systematically if the 
tap has vocoids of its own in the intervocalic context as well. Specifically, one 
would expect the formants to change, when nearing the constricted interval, 
towards a configuration similar to that of a mid-high, central (to front) vowel, 
which is the area in which the vocoids of the tap are. 

This would indeed appear to be the case, as Figures 6-8 below show for the
nonsense sequences. 
Figure 6 - The nonsense sequence [ara].


The vowel [a] has a high F1 and a low F2. Figure 6 for the sequence [ara] shows 
that, near the constriction, F1 decreases and F2 increases, making the target 
configuration higher and more front than [a]. 
Figure 7 - The nonsense sequence [iri].


A low F1 and a high F2 are characteristics typical of the vowel [i]. In Figure 7, 
showing the nonsense sequence [iri], a slight increase in F1 and a decrease in F2 
can be noticed, which means that, towards the constriction, formants aim for a 
vowel that is a little lower and more back than [i]. 
Figure 8 - The nonsense sequence [uru].


The vowel [u] has a low F1 and a low F2, which are visible at the edges of Figure 
8. Towards the constriction, F2 increases, suggesting a vowel which is more 
front than [u]. Some tokens, such as the one in the spectrogram above, even 
exhibited a portion in which the formants are in a steady-state configuration 
near the constriction, which would support the claim that the tap has vocoids 
that are detectable in context VrV. 
Considering Figures 6, 7 and 8 together, the formant changes suggest that, in
the immediate vicinity of the constricted interval, formants tend to approach
the configuration that would place the vowel in the area in which the (salient)
vocoids of the tap cluster in other contexts⁴, as indicated in Graphs 1-4. In
addition to this, Figure 8 shows a token in which the vocoids are salient even 
in VrV, as suggested by the steady-state portion of the formants immediately 
before and after the constricted interval. 
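The expectation behind Figures 6-8 can be made explicit: if the formants head for a mid-high, central (to front) configuration near the constriction, the direction of change in F1 and F2 follows from where the full vowel sits relative to that region. The sketch below is illustrative only; the target and vowel formant values are invented round numbers, not measurements from this study.

```python
def expected_change(vowel, target):
    """Direction in which F1 and F2 should move from a full vowel
    towards a target configuration; both are (F1, F2) pairs in Hz."""
    def direction(start, end):
        if end > start:
            return "rises"
        if end < start:
            return "falls"
        return "stays level"

    return {
        "F1": direction(vowel[0], target[0]),
        "F2": direction(vowel[1], target[1]),
    }


# Invented illustrative values (Hz): a mid-high, central-to-front target
# and three full vowels.
VOCOID_TARGET = (450, 1700)
VOWELS = {"a": (800, 1300), "i": (300, 2400), "u": (320, 800)}

for name, formants in VOWELS.items():
    print(name, expected_change(formants, VOCOID_TARGET))
```

With these placeholder values, [a] gives a falling F1 and a rising F2, [i] a rising F1 and a falling F2, and [u] a rising F2, which is the pattern described for the three nonsense sequences.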


3. Phonetic conclusions 


The data from Romanian, corroborated by data from other languages, seem
to support the hypothesis that [r] includes one vocalic element flanking each 
side of the constriction. The results of the current experiment on this sound in 
Romanian suggest that the tap's vocalic elements may vary in quality, but stay in 
the mid-high, central to front area. 

The tap's vocoids are not clearly delimited where the rhotic borders with a
vowel because on a spectrogram they show up as a continuous vocalic sequence, 
perhaps with formant changes. One cannot tell where the vocoid of the tap 
ends and the full vowel begins. However, the vocoids of the tap become salient 
when they border with stop consonants because the stops have different 
spectral characteristics. This is why one vocalic element is salient when the tap 
has a nuclear vowel on one side: the vocoid on the other side would not be 
distinguishable from the nuclear vowel. The full structure is easy to distinguish 
only when [r] has no nuclear vowels on either side (contexts #rC, CrC, Cr#). 


⁴ An anonymous reviewer draws my attention to the fact that, in VrV, we might view the vocalic parts as
simple transitions. While this is indeed the case, I consider that they are better viewed as parts of the tap,
given the clearly delimited vocoids that appear in other contexts. Future research may shed more light on
the matter, for example, by comparing the transitions in VrV to those in VdV, as the reviewer suggests.
4. A phonological implication


This kind of structure, which includes vowel-like parts, may be what allows [r] 
to appear in onset and coda position, but also function as a syllabic nucleus, as is 
the case in Slavic languages like Czech and Serbo-Croatian. In these languages 
one can find entire sentences composed only of consonants (see Figure 9 below). 
Figure 9 - The Czech tongue twister sentence Strč prst skrz krk ‘Put your finger through your
throat’, which contains only consonants. The syllabic nuclei are rhotic taps. Source of the
sound-file: http://upload.wikimedia.org/wikipedia/commons/1/12/Prst a krk.ogg.


/r/ may even be the locus of phonemic length and pitch distinctions in Slovak
(length) and Serbo-Croatian (length and pitch) (Sussex & Cubberley 2006: 187-188).
If its [r] realization contains vocalic elements (see Pavlík 2008 for an
acoustic study of /r/ in Slovak), it would be reasonable to assume that they are
the ones bearing said distinctions⁵. As an example, in Serbo-Croatian there
are minimal pairs of words distinguished only by the tone on /r/. For instance,
bŕzo (long rising) is the adjective ‘quick’, neuter singular form, while bȓzo (long
falling) is the corresponding adverb, ‘quickly’. Figure 10 below shows the
minimal pair uttered by a native speaker. /r/ is realized as [r], and the structure
‘vocalic element - constriction - vocalic element’ is easily distinguishable.


⁵ /r/ is not the only consonant with the ability to be a syllabic nucleus and bear length and pitch distinctions.
In Czech /l/ can be syllabic as well, and in Slovak it can be a syllabic nucleus and carry length distinctions
along with /r/. As is known, /l/ and /n/ are vowel-like on a spectrogram, something which I consider to be
related to their ability to be syllabic nuclei, and carry length distinctions in the case of /l/. In fact, if the
ability of /l/ to exhibit this behavior is linked to its vocalic character, it would be only expected for /r/ (in
this case the tap) to be vocalic in character as well, which may be taken as an additional argument in favor
of the 'vocoid-constriction-vocoid' structure.


Figure 10 - bŕzo (long rising) and bȓzo (long falling), uttered by a female native speaker of Serbian.


5. Conclusion 


The main focus of this paper was to establish the details of the internal phonetic
structure of [r]. It was argued that the general structure of this sound is
‘vocoid-constricted interval-vocoid’, which would unify the seemingly different
realizations of the rhotic segment with one constricted interval that appear in
different phonetic contexts. An acoustic analysis of the formant structure of the
aforementioned vocoids in Romanian revealed them to be mid-high and central
(to front), which agrees with and complements similar acoustic studies done on this
sound in other languages. Finally, I suggested that this partly vocalic structure is
what allows the tap to be a syllable nucleus and bear phonemic length and pitch
distinctions, as it does in languages like Slovak and Serbo-Croatian.
New insights into American English V+/r/
sequences 


María Riera & Joaquín Romero, Universitat Rovira i Virgili 


Abstract 

This paper presents an acoustic study of final V+/r/ sequences in American English 
stressed monosyllables. We provide experimental data to show the durational and spectral 
characteristics of the vowel, the consonant and the VC transition, we explain the presence of 
this transition in relation to the vowel and the consonant, and we examine the role of speaking 
rate. The results show the presence of a transitional vocalic element that varies significantly 
as a function of the vowel and speaking rate. They also show significant durational and 
spectral differences which can be interpreted as the result of VC coarticulation. 


1. Introduction 


1.1 Overview 

The study presented in this paper forms part of a wider ongoing acoustic study 
that seeks a better understanding of the phonetic and phonological nature of 
final V+/l/ and V+/r/ sequences in American English stressed monosyllables by
investigating the VC coarticulatory processes that take place in them. On the 
one hand, the present study expands on and replicates in part previous studies 
carried out by the authors (Riera & Romero 2006, 2007; Riera et al. 2009) in an 
attempt to gain new insights into the behavior of V+/r/ sequences in particular. 
On the other hand, the present study introduces innovative aspects related to 
participants, stimuli, segmentation procedures and measurements taken: the 
number of participants has been increased, the stimuli have been modified, 
a more objective method of segmentation and boundary identification has 
been applied and consonant (i.e., /r/) measurements have been included. In 
this study we provide experimental acoustic data to show the durational and 
spectral characteristics of the vowel, the consonant and the VC transition, 
we explain the presence of this transition in relation to the vowel and the 
consonant, and we examine the role of speaking rate. 
1.2 Previous studies

Previous studies that have looked into V+/r/ sequences have focused on the 
schwa-like element that is often perceived in some of these sequences. Terms 
like epenthetic schwa (Warner et al. 2001), excrescent schwa (Gick & Wilson 
2001, 2006) or targetless schwa (Browman & Goldstein 1992b) might be 
used to refer to this element. According to Gick & Wilson (2001, 2006), the 
perceptual presence of this element after high front vowels can be explained as 
the result of the tongue movement required in passing through a schwa-like 
configuration. Browman & Goldstein (1992b) make reference to the influence 
exerted by neighboring segments on what they call targetless schwa. Wells
(2000) uses the term pre-r breaking¹ to refer to cases of schwa epenthesis in
sequences containing high vowels, whereby monophthongs become diphthongs 
and diphthongs become triphthongs. Lavoie & Cohn (1999) state that 
monosyllables consisting of non-low tense pure vowels or diphthongs followed 
by a liquid can be pronounced with either one or two syllables. Hall (2003, 
2006) distinguishes between schwa intrusion and schwa epenthesis/insertion. 
In her view, intrusive vowels are phonologically invisible, are inserted late in the 
phonological derivation, cannot act as syllable nuclei, do not add a syllable to 
the word and do not involve the addition of a vowel segment. Moreover, they 
are not likely to occur in the most marked types of CC clusters, tend to occur 
between heterorganic consonants, copy only over sonorants or gutturals and are 
either copy vowels or neutral and schwa-like in quality. 

Riera & Romero (2006) provide an impressionistic analysis of V+/l/ and V+/r/
sequences by means of visual spectrographic observation in a preliminary
descriptive study that relies on acoustic data from two speakers and considers
the whole range of American English stressed vowels. The study acknowledges
the presence of VC transitions in some of these sequences and of a variable 
schwa-like element which is not visually detectable to the same extent in all 
of the VC transitions. It also suggests a relationship between front versus back 
versus central vowels as well as between high and tense versus non-high and lax 
ones. No acoustic measurements are taken in this study. The role of speaking rate 
is evidenced only by the fact that VC transitions are more easily discernible in 
slow tokens than in fast ones. It is concluded that the presence of the transitional 
element is the result of a dynamic phonetic process of coarticulation rather than 
of a discrete phonological rule of epenthesis/insertion. 

¹ Wells (2000) also uses the term pre-/l/ breaking to refer to cases of schwa epenthesis in V+/l/ sequences.


In experimental studies conducted by Riera & Romero (2007) and Riera et al.
(2009), durational and spectral measurements reveal differences between the
schwa-like element and canonical schwa² as well as variability in the schwa-like
element as a function of both the preceding vowel and speaking rate. The
formant values of this element are significantly different from those of canonical
schwa and tend to resemble more those of the preceding vowel the faster the
speaking rate in both the V+/l/ (Riera & Romero 2007) and the V+/r/ (Riera
et al. 2009) sequences. The phenomenon under analysis is regarded in these
studies as a generalized process affecting all contexts (i.e., all stressed vowels
+ /l/ or /r/), rather than, for example, only high vowels, as has been implied
by previous studies (Gick & Wilson 2001, 2006; Lavoie & Cohn 1999; Riera 
& Romero 2006; Wells 2000). As in Riera & Romero (2006), coarticulation, 
rather than epenthesis/insertion, is favored. The segmentation procedure in 
these studies is based solely on the observation of acoustic waveforms and 
spectrograms as well as on the auditory corroboration by the experimenters. 
These studies rely on acoustic data from only one (Riera & Romero 2007) or
two (Riera et al. 2009) speakers. Durational, F1 and F2 measurements are
obtained in both studies, but F3 measurements are obtained only for the V+/r/
sequences. Measurements for the vowel and the transitional element only are
obtained in both studies; neither of them includes consonant (i.e., /l/ or /r/)
measurements and thus the behavior of the transitional element is explained
only in terms of its relationship with the preceding vowel. Speaking rate (i.e.,
slow vs. fast) differences are considered in both studies. Thus, the current study
expands on these previous findings by offering a more reliable methodological 
approach to segmentation and by providing data for the consonants as well as 
for a larger pool of subjects. Also, it provides measurements of the different 
parts of the sequences taken at midpoint rather than mean measurements of 
them, which was the measurement procedure used in previous studies. 
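The difference between a midpoint measurement and a mean measurement matters most for segments whose formants move, such as the transitional element. A minimal sketch of the two procedures follows, assuming an equally spaced formant track; the function name and the sample values are invented for illustration and are not data from any of the studies cited.

```python
from statistics import mean


def midpoint_value(track):
    """Formant value at the sample closest to the temporal midpoint of a
    segment, assuming equally spaced samples."""
    if not track:
        raise ValueError("empty formant track")
    return track[len(track) // 2]


# Invented F2 track (Hz) for a segment with a falling F2.
f2_track = [2300, 2200, 2000, 1700, 1500]

f2_midpoint = midpoint_value(f2_track)
f2_mean = mean(f2_track)
print(f2_midpoint, f2_mean)
```

For a perfectly steady segment the two values coincide; for a moving formant the mean depends on the shape of the whole trajectory, whereas the midpoint reflects a single point in time.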


1.3 The present study: Objectives and hypotheses 

As mentioned above, the main objective of this study is to investigate the
VC coarticulatory processes that take place in final V+/r/ sequences in American
English stressed monosyllables. In order to do this, (i) we provide experimental
data to show the durational and spectral (i.e., F1, F2 and F3) characteristics of
the vowel, the consonant and the transitional vocalic (i.e., schwa-like) element,
(ii) we explain the presence of this element in relation to the vowel and the
consonant, and (iii) we determine the role of speaking rate (iiia) by looking
for durational, F1, F2 and F3 variability in the vowel, the transitional element
and the consonant, and (iiib) by comparing F1, F2 and F3 mean values in the
different contexts (i.e., each of the V+/r/ sequences) and the different rates (i.e.,
slow and fast).


² Canonical schwa refers to a lexically-licensed vowel that shows relatively stable spectral characteristics and
is not usually subject to significant contextual variability, as in the first syllable of the word ahead.

The results of this study are expected to provide evidence for the existence of 
coarticulatory processes and to make manifest the extent of VC coarticulation 
in the V+/r/ sequences under study. By looking into the behavior of the vowel, 
the transitional element and the consonant in the sequences, and by looking 
at the influence exerted by both the vowel on the transitional element and the 
transitional element on the consonant, the phonetic, rather than phonological, 
nature of the transitional element will be revealed. 

We hypothesize (i) that there will be significant durational, F1, F2 and F3 
variability in the vowel, the transitional element and the consonant, across 
contexts, and as a function of speaking rate, and (ii) that the F1, F2 and F3 
mean values of the different contexts will tend to resemble each other more 
in the slow-rate productions than in the fast-rate ones. This is expected to 
be especially the case for the vowel and the transitional element but not so 
much for the consonant. The greatest differences are expected to be particularly 
noticeable for F1 and F2 but less so for F3. 

The hypotheses presented here regarding the coarticulatory nature of the
V+/r/ transitions are in accordance with the approach to speech production
and gestural organization illustrated by the theory of Articulatory Phonology
(Browman & Goldstein 1986, 1989, 1990a, 1990b, 1992a; Goldstein & Fowler
2003). Articulatory Phonology offers a view of phonological organization 
based on articulatory gestures as primitive units that are responsible for both 
phonological invariance and phonetic variability and thus bridges the gap 
between the two levels of description. A key aspect of the theory for our study 
is the fact that it contemplates time as an intrinsic part of the description 
of gestures, therefore providing a much more plausible explanation for the 
coarticulatory variability caused by rate differences, than would be given by a 
theory based on discrete underlying segments. 


2. Method 


2.1 Speakers 

The subjects that participated in the experiment were six native speakers
of American English. Four were male and two female. They all had rhotic
accents. Three had a western accent (California, Utah, Wyoming), one a mid-western
accent (Wisconsin) and one an upper-southern accent (Tennessee).
The last speaker reported having lived in different parts of the US but self-identified
her accent as being mid-western. Their ages ranged from 24 to 40.
Four of them were temporarily living in Spain for a period of at least one
year at the time of the recording; two had been living in Spain for over five
years. Only one speaker had some specialized phonetic training; the rest had
none. All the speakers were unaware of the purposes of the experiment prior
to being recorded. Sex, type of accent, age, place of residency, contact with or
knowledge of the Spanish language, and specialized phonetic training were
not considered factors that would negatively affect the purposes of our
experiment.


2.2 Stimuli 

The target words that were selected for the experiment reported in this paper
were seven English monosyllables containing final V+/r/ sequences (i.e., fear,
fair, par, pore, poor, hire and power). Fifteen English monosyllables containing
final V+/l/ sequences (i.e., feel, bill, pale, fell, pal, Poll, Paul, hole, pull, pool, hull,
furl, pile, howl and boil) were also included as target words to be separately
analyzed as part of the wider ongoing study. Fifteen distracters, consisting of
C1VC2, where C2 was one of /t/ or /d/, were included as well. These were the
words heat, fit, hate, vet, fat, bot, fought, vote, hood, food, hut, heard, hide, void and 
vowed. All the target words and distracters were inserted in the carrier sentence 
Say ___ for me again. In order to minimize unwanted coarticulatory effects, C1 
was a non-lingual (unlike /r/ and /l/) and oral (like /r/ and /l/) consonant in the 
target words, the distracters and the word for. 


2.3 Data collection 

The six speakers performed two readings each of ten randomized repetitions of 
the carrier sentence containing the target words and distracters reported in the 
previous subsection. The first reading was performed at a slow speaking rate; the 
second at a faster one. The speaking rate variable was controlled for by presenting 
the slow-rate readings at four-second intervals separated by a three-second break 
every 20 sentences and the fast-rate readings at one-second intervals with a 
three-second break every five sentences. The readings took place in two different 
sessions separated by a 30-minute period. Each of the sessions was preceded 
by an instruction period and a trial period of 20 tokens, which were not used 
for the analysis. After the second session, the speakers were informed of the 
purposes of the experiment and were asked to fill out a questionnaire to provide 
some very general personal information relevant only to the purposes of the 
experiment. The data were recorded at a 44,100 Hz sampling rate directly into a 
laptop computer using an M-Audio Nova condenser microphone, an M-Audio
Firewire Solo mobile interface, and the Praat speech analysis software (Boersma 
& Weenink 2010), which was also used for the subsequent data analysis. 
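The size of the resulting data set follows from the design described in Sections 2.2 and 2.3. The tally below is a sketch under the assumption that every item occurred once in each of the ten repetitions and that no tokens were discarded; trial tokens are excluded, and the variable names are introduced here for illustration.

```python
SPEAKERS = 6
RATES = 2          # slow and fast readings
REPETITIONS = 10   # randomized repetitions per reading

# Items per repetition, as listed in Section 2.2.
items = {"V+/r/ targets": 7, "V+/l/ targets": 15, "distracters": 15}

sentences_per_reading = REPETITIONS * sum(items.values())
total_sentences = SPEAKERS * RATES * sentences_per_reading
vr_tokens = SPEAKERS * RATES * REPETITIONS * items["V+/r/ targets"]

print(sentences_per_reading, total_sentences, vr_tokens)
```

Only the V+/r/ tokens are relevant to the analysis reported in this paper; the remaining sentences belong to the wider study or serve as distracters.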


2.4 Data analysis 

2.4.1 Segmentation procedure 

From a segmental point of view, the V+/r/ sequences under study are considered 
to be composed of two elements only (i.e., a vowel followed by a consonant). 
However, in order to identify the transitional element in them, the sequences 
had to be divided into three parts, corresponding to the vowel, the transitional 
element and the consonant. In the case of sequences containing diphthongs, 
they were divided into four parts and it was the second element of the diphthong 
that was taken into account for the analysis. 

Given the dynamic nature of the transitional element, and therefore the 
difficulties in identifying and delimiting it, we applied a first differentiation 
algorithm to the F1, F2 and F3 traces as identified by an automatic formant 
tracking routine in order to obtain velocity curves for each of these spectral 
events. This allowed us to automatically identify inflection points in the formant
traces that corresponded with the boundaries between the three portions of the 
signal under study and thus made it possible to isolate the transitional element. 
A Praat script was written to obtain these first derivative traces and identify the 
peaks of formant change given by velocity maxima and minima. These peaks
were then taken as reference points for boundary placement. 
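The logic of this procedure can be sketched outside Praat. The Python fragment below is not the script used in the study; it is a minimal illustration, with invented function names and an invented F2 trace, of how a first difference yields a velocity curve and how its local maxima and minima can be located.

```python
def first_difference(trace, step_s):
    """Approximate velocity (Hz/s) of a formant trace sampled every
    `step_s` seconds; entry i covers the interval between samples i and i+1."""
    return [(b - a) / step_s for a, b in zip(trace, trace[1:])]


def velocity_extrema(velocity):
    """Indices of strict local maxima and minima in a velocity curve."""
    peaks = []
    for i in range(1, len(velocity) - 1):
        prev, cur, nxt = velocity[i - 1], velocity[i], velocity[i + 1]
        if cur > prev and cur > nxt:
            peaks.append((i, "max"))
        elif cur < prev and cur < nxt:
            peaks.append((i, "min"))
    return peaks


# Invented F2 trace (Hz) at 10 ms steps: steady vowel, fall, steady portion.
f2 = [2300, 2300, 2290, 2250, 2150, 2000, 1900, 1860, 1850, 1850]
velocity = first_difference(f2, 0.010)
extrema = velocity_extrema(velocity)
print(extrema)
```

In this toy trace the single velocity minimum falls where F2 drops fastest, between the fifth and sixth samples, which is the kind of point that would serve as a candidate boundary. Real formant tracks are noisier, so some smoothing or a minimum peak size would probably be needed before such peaks could be used.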

Figure 1 illustrates the segmentation procedure. The upper part of the figure 
shows the acoustic wave and the spectrogram for the slow version of the /ir/ 
sequence in the word fear as produced by one of the speakers. In the lower 
part there are a series of tiers which provide information about F1 and F2 first 
derivative peaks of formant change indicating velocity maxima and minima 
(first and third tiers). These maxima and minima correspond to inflection points 
in the formant trace and can, therefore, be identified with the beginning and
end of specific events. The second tier in this lower part of Figure 1 shows the 
segmentation of the sequence into three parts (i.e., vowel, transitional schwa- 
like element and consonant). The vertical lines in this second tier are determined 
by observing where the broken lines in the F1, F2 and, if necessary F3, tiers fall, 
and then by deciding which of these lines correspond to the beginning and end 
of the different parts of the sequence. In the case exemplified here, one peak 
provided by the F2 derivative was chosen to mark the beginning of the schwa- 
like element, whereas one peak in the F1 derivative was chosen to mark its end. 
Because it was not necessary to rely on F3 derivative peaks, these are not shown 
in the figure. 

As might be inferred from the information provided in Figure 1, problems related to 
determining boundary placement often arise. In such cases, the automated procedure 
needs to be complemented by visual observation of waveform and spectrographic 
cues as well as by auditory corroboration. This is particularly necessary in the case 
of fast tokens and sequences containing low back vowels (i.e., /a/ and /ɔ/). The 
objective segmentation procedure can therefore be considered more reliable than 
a subjective one only to a certain extent, but it is nonetheless reliable enough 
to ensure consistency in segmentation. 


[Figure 1: waveform and spectrogram, with three TextGrid tiers below: F1 derivative, segmentation (/i ə r/, FEAR), F2 derivative]


Figure 1 - Segmentation procedure for the /ir/ sequence corresponding to the slow version of 
the word FEAR as produced by one of the speakers. The vertical broken lines in the first and 
third textgrid tiers represent the F1 and F2 first derivative peaks of formant change indicating 
velocity maxima/minima. The second tier shows the /ir/ sequence segmented into three parts: 
vowel, transitional element and consonant. 


2.4.2 Measurements 

A Praat script was designed to extract duration and midpoint F1, F2 and F3 values 
for the vowel, the transitional element and the consonant. Mean values for each 
context (i.e., each of the V+/r/ sequences) and for each rate (i.e., slow and fast) 
were then obtained and used for the statistical analyses. F1, F2 and F3 mean 
values were also used for comparisons between the slow and fast speaking rates. 
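
The averaging step can be illustrated with a short sketch. The record layout, the labels and the F1 values below are hypothetical and are not taken from the study's data; the sketch only shows how token-level measurements might be grouped by context, rate and portion of the sequence before computing means.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical token-level records: (context, rate, portion, F1 in Hz).
# "T" stands for the transitional element.
tokens = [
    ("fear", "slow", "T", 420.0),
    ("fear", "slow", "T", 440.0),
    ("fear", "fast", "T", 380.0),
    ("fear", "fast", "T", 400.0),
]


def mean_by_cell(records):
    """Average the measurement within each (context, rate, portion) cell."""
    groups = defaultdict(list)
    for context, rate, portion, value in records:
        groups[(context, rate, portion)].append(value)
    return {cell: mean(values) for cell, values in groups.items()}


cell_means = mean_by_cell(tokens)
```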


2.4.3 Statistical analyses 

Two-way factorial ANOVAs were performed to test for duration, F1, F2 and 
F3 overall variability in the vowel, the transitional element and the consonant. 
The independent variables were rate (i.e., slow and fast) and context (i.e., each 
of the V+/r/ sequences); the dependent variables were duration, F1, F2 and F3 
mean values. 

In the cases where interactions proved to be significant, independent one-way 
ANOVAs for each of the two rates (i.e., slow and fast) were subsequently 
performed to confirm the variability shown by the two-way factorial ANOVAs, 
or to test for further variability, by looking at the two rates separately. The 
independent variable was context (i.e., each of the V+/r/ sequences); the 
dependent variables were duration, F1, F2 and F3 mean values. 
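
For readers less familiar with the procedure, the F statistic behind a one-way ANOVA with context as the single factor can be sketched as below. This is an illustrative pure-Python computation on invented F2 values, not the statistical software or the data used in the study; obtaining a p value would additionally require the F distribution, which is left out here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per context."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical F2 values (Hz) for the transitional element in three contexts.
samples = [
    [1500.0, 1520.0, 1540.0],
    [1300.0, 1320.0, 1340.0],
    [1100.0, 1120.0, 1140.0],
]
f_stat, df_b, df_w = one_way_anova_f(samples)
```

A large F value relative to its degrees of freedom indicates that the context means differ more than would be expected from within-context scatter alone.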


3. Results 


3.1 ANOVAs for variability 

As mentioned above, the two-way factorial ANOVAs looked for duration, 
F1, F2 and F3 overall variability in the vowel, the transitional element and 
the consonant. Rate and context were the independent variables and duration, 
F1, F2 and F3 mean values the dependent variables. Significance level was set 
at p<.01. Significant differences were obtained in almost all cases. The results 
showing variability in the vowel were significant for all speakers, for rate, context 
and the interaction between rate and context, and for duration, F1, F2 and F3. 
The results showing variability in the transitional element and the consonant 
were significant for all speakers, for context, and for duration, F1, F2 and F3. 
For the transitional element, they were non-significant in the following cases, 
which involve combinations of rate or the interaction between rate and context, 
and duration, F1, F2 or F3: Speaker 2 (rate, F2), Speaker 3 (rate*context, 
duration; rate, F3), Speaker 4 (rate, F3), and Speaker 6 (rate*context, duration; 
rate, F1). The results showing variability in the consonant were non-significant 
in the following cases: Speaker 1 (rate*context, duration; rate, F1), Speaker 2 
(rate, F2; rate*context, F3), Speaker 3 (rate*context, duration), Speaker 4 
(rate*context, duration; rate*context, F1), Speaker 5 (rate*context, duration; 
rate, F2), and Speaker 6 (rate, F2). 

Note: here, variability in the vowel does not refer to intra-token variability but rather to the comparison between 
the mean values for the vowels in the different contexts (i.e., fear vs. hire vs. fair vs. par vs. pore vs. poor vs. 
power). As expected, the results show that these vowels are indeed different. The reason why we have 
decided to make this seemingly obvious comparison is so that it can then be compared with the differences 
in the transitional element and thus show that the transitional element retains some of the variability of the 
vowel but is much more deeply affected by the lack of a specific articulatory target, as demonstrated by the 
significant differences across rates. 

The results of the separate one-way ANOVAs performed to confirm the 
variability shown by the two-way factorial ANOVAs, or to test for further 
variability, with context as the independent variable and duration, F1, F2 and 
F3 as the dependent variables, also yielded significant differences in almost all 
cases. Significance level was again set at p<.01. As with the two-way factorial 
ANOVAs, the results showing variability in the vowel were significant for all 
speakers, for duration, F1, F2 and F3, and for both rates. The results showing 
variability in the transitional element were significant for five speakers, for 
duration, F1, F2 and F3, and for both rates. The exception was Speaker 4, with 
non-significant differences for duration for both rates. The results showing 
variability in the consonant were significant for two speakers, for duration, F1, 
F2 and F3 for the slow rate. They were non-significant in the following cases: 
Speaker 1 (F2, slow), Speaker 2 (F3, slow), Speaker 4 (duration, slow), and 
Speaker 5 (duration, fast; F2, slow). 


3.2 Means for comparisons between speaking rates and variability 

Figures 2, 3 and 4 show scatter plots for mean F1, F2 and F3 values, respectively, 
(i) for one speaker, (ii) for the seven V+/r/ contexts, (iii) for the vowel, the 
transitional element and the consonant, and (iv) for the slow-rate and fast-rate 
productions. Due to space constraints, the data from one speaker only will be 
used to exemplify what, as a general rule, applies to the other five speakers as 
well. 

As can be observed in these scatter plots, formant values for the same target 
words show clear differences across speaking rates (e.g., compare slow and fast 
fair V F1, slow and fast hire T F2 or slow and fast poor C F3). This is especially 
noticeable for F1 and F2 and less so for F3. It is also particularly discernible in 
the case of the vowel and the transitional element, but not so much in the case 
of the consonant. 

What can also be detected in these scatter plots is the fact that the difference 
between the mean values across contexts tends to be smaller in the slow-rate 
productions than in the fast-rate ones. In other words, there is greater dispersion 
between the mean values of the seven tokens in the fast rates than in the slow 
ones. Again, this can be easily seen in the case of F1 and F2 but is not easy 
to perceive in the case of F3. Likewise, it is more easily distinguishable in the 
vowel and the transitional element than in the consonant. 
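
The notion of dispersion used here can be made concrete with a small sketch. The context means below are invented for illustration (they are not the values plotted in Figures 2 to 4), and the choice of the population standard deviation as the dispersion measure is an assumption of this example rather than a measure reported in the study.

```python
from statistics import pstdev

# Hypothetical mean F1 values (Hz) of the transitional element, by rate and context.
context_means = {
    "slow": {"fear": 400.0, "par": 600.0, "poor": 500.0},
    "fast": {"fear": 350.0, "par": 700.0, "poor": 450.0},
}


def dispersion_by_rate(means):
    """Spread of the context means within each rate (population standard deviation)."""
    return {
        rate: pstdev(list(by_context.values()))
        for rate, by_context in means.items()
    }


spread = dispersion_by_rate(context_means)
```

With these invented values the spread is larger for the fast rate than for the slow one, which is the pattern described in the paragraph above.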

These observations provide grounds to state that F1, F2 and F3 mean values 
vary across rates in the vowel, the transitional element and, to a lesser extent, 
the consonant. 


[Figure 2: scatter-plot panels V F1, T F1 and C F1; legend entries include PAR, PORE, POOR and POWER]

Figure 2 - Scatter plots for F1 values for one speaker by context. Each data point represents 
the mean for 10 tokens in each category. The vertical axis shows F1 frequency. The horizontal 
axis shows the values for the vowel (V), transition (T) and consonant (C) as well as the slow 
and fast rate values for each of these. 

Figure 3 - Scatter plots for F2 values for one speaker by context. Each data point represents 
the mean for 10 tokens in each category. The vertical axis shows F2 frequency. The horizontal 
axis shows the values for the vowel (V), transition (T) and consonant (C) as well as the slow 
and fast rate values for each of these. 

Figure 4 - Scatter plots for F3 values for one speaker by context. Each data point represents 
the mean for 10 tokens in each category. The vertical axis shows F3 frequency. The horizontal 
axis shows the values for the vowel (V), transition (T) and consonant (C) as well as the slow 
and fast rate values for each of these. 


4. Discussion and conclusions 


The purpose of this study has been to further investigate the VC coarticulatory 
processes that take place in final V+/r/ sequences in American English stressed 
monosyllables in order to contribute new insights into the behavior and nature of 
these sequences. These insights are meant to expand on the results obtained and 
conclusions reached in previous studies carried out by the same authors (Riera & 
Romero 2006, 2007; Riera et al. 2009). We have designed an experiment which 
replicates in part these previous studies but also introduces innovative aspects 
related to participants, stimuli, segmentation procedures and measurements 
taken. We have gathered acoustic data for the different constituent elements in 
the sequences that have allowed us to confirm already existing conclusions and 
reach new ones concerning the role played by speaking rate. 

The first hypothesis (i.e., that there is significant durational, F1, F2 and F3 
variability in the vowel, the transitional element and the consonant, across 
contexts, and as a function of speaking rate) has been confirmed by the results 
of the statistical analyses, as well as by the information presented in Figures 2, 
3 and 4. This provides evidence for the existence of coarticulatory processes and 
shows the extent of VC coarticulation in the V+/r/ sequences which are the 
object of our study. Despite having mean duration, F1, F2 and F3 values similar 
to those of a mid central vowel (i.e., schwa), the transitional element has been 
proven to be different in each of the different contexts (i.e., different vowels + 
/r/). This rules out the possibility of the transitional element being considered a 
segment and thus reveals its phonetic, rather than phonological, nature. 

The second hypothesis (i.e., that the F1, F2 and F3 mean values of the different 
contexts tend to resemble each other more in the slow-rate productions than in 
the fast-rate ones) is also borne out and provides evidence for the dynamic nature 
of the sequences, in general, and of the transitional element, in particular. The 
information provided in Figures 2, 3 and 4 shows that it takes longer for the vowel to attain the 
transitional element target and for this element to attain the consonant target 
in the slow productions than in the fast ones. It shows, therefore, how an 
increase in speech rate entails a decrease in time for the articulatory gestures 
to attain their targets. All in all, it proves that we are dealing with a process 
of coarticulation rather than epenthesis/insertion. This also complements the 
findings of previous studies (Riera & Romero 2007; Riera et al. 2009) that 
reveal how the coarticulatory influence of the vowels on their corresponding 
transitional elements is shown by the fact that the spectral values of these 
elements tend to resemble more those of the preceding vowels the faster the 
speaking rate. 

Despite not varying much across V contexts, the acoustic characteristics of the 
/r/ in the different sequences show some variability, which can be taken as proof 
of the coarticulatory influence exerted by the vowel on the schwa-like element, 
by the schwa-like element on the consonant, and even by the vowel on the 
consonant. The fact that the variability is smaller in the /r/ than in the schwa- 
like element is explained by the fact that the /r/ is present underlyingly and, 
therefore, it is associated with clearly determined articulatory targets, whereas 
the schwa-like element does not correspond to any underlying segment and, 
therefore, has no specific articulatory targets. 

The present study has not aimed at finding relationships between the sequences 
according to the phonological parameters for the classification of vowels (i.e., 
vowel height or frontness/backness). A possible further study would look into 
the role played by context (i.e., each of the seven different vowels in the V+/r/ 
sequences) as well as examine vowel-transition and transition-consonant 
differences. 

Finally, we believe that the limitations posed by an acoustic analysis of the type 
reported in this paper, based on segmentation as well as durational and spectral 
measurements, can only be overcome by an articulatory analysis of the type 
offered, for example, by the Electromagnetic Midsagittal Articulometer (EMMA) 
technique. This type of study is meant to be considered for future research. 

/r/ in Washili Shingazidja 


Cédric Patin, Université Lille 3 


Abstract 

In this paper, the distribution of the various allophones of /r/ in the Washili variety of 
Shingazidja, a Bantu language spoken on Grande Comore, is discussed in detail. /r/ 
appears as a trill ([r]) in absolute initial position (except before [i]) and after a consonant, 
and as a tap ([ɾ]) in intervocalic position. Complications arise since /r/ undergoes fortition 
to [t] in some classes but undergoes lenition in initial position when the following vowel 
is low-toned. An analysis is sketched in the CVCV framework (Lowenstamm 1996; 
Scheer 2004), claiming that the [r] allophone is underlyingly a geminate. 


1. Introduction 


In this paper, I discuss in detail the distribution of the various allophones of /r/ 
(e.g. the trill [r] and the tap [ɾ]) in the Washili variety of Shingazidja. Shingazidja 
is a Bantu language (G44a) spoken on Grande Comore, an island belonging 
to Comoros (Shingazidja is one of the five Comorian languages). This is to 
my knowledge the first account of the distribution of rhotics in the language, 
and one of the very few discussions on rhotics in Bantu languages. A CVCV 
analysis of the distribution of the allophones of /r/ in Washili Shingazidja is 
also provided. 

One speaker of this variety, Said Mohamed (34; in France for approximately 10 
years), has been recorded (specifically for /r/) up to the present, with most of 
the recordings taking place in August 2010 and April 2011. The corpus consists 
of around 100 words and 20 phrases and sentences, each associated with several 
iterations, which were recorded twice: at Université Lille 3 (Villeneuve d'Ascq, 
France), in a closed office, with an Edirol R1 (microphone) and at ILPGA, 
Université Paris 3 (Paris, France), in an anechoic room. 

In Section 2, I will provide some background information on Shingazidja, i.e. 
its phoneme inventory and previous mentions of /r/ in the literature. In Section 
3, the basic distribution of the different allophones of /r/ is presented. I point 
out some complications, i.e. the role of consonants and tones in the distribution, 
in Section 4. In Section 5, I will defend the hypothesis, sketched in the CVCV 
framework, that the trill is associated with two skeletal positions (while the tap 
and the [t3] allophone are associated with one skeletal position). 


2. Background 


In this section, I provide some necessary background on Shingazidja as a 
language. All the information in this section may be applied to any of the 
varieties of Shingazidja. The first subsection is dedicated to the vowels and 
prosodic system of the language, while the second subsection focuses on 
consonants. In 2.3, I briefly review previous accounts of /r/ in Shingazidja. 


2.1 Vowel inventory and the prosodic system 
Shingazidja has a classic 5-vowel system: 


i       u
  e   o
    a 


There are also nasal vowels in some Arabic loans (1-a), mostly when the 
Arabic word contains a pharyngeal or a glottal (a phenomenon known as 
‘rhinoglottophilia’, a term that comes from Matisoff 1975), or in ideophones 
(1-b). 


(1) a. áda ‘custom’ (< Ar. ʕadah) 
b. dhá ‘no’ 


Shingazidja has a word-group stress that falls on the penult of the phonological 
phrase. The language is also characterized by a reduced tone system (similar to 
a pitch-accent system) with complex manifestations such as unlimited shift of 
the tone — see Cassimjee & Kisseberth (1998), Patin (2007). 


2.2 Consonants 
Table 1 shows the consonant inventory of Shingazidja, following Ahmed-Chamanga 
(2010), Full (2006), Lafon (1987), Rombi & Alexandre (1982) and my own 
observations. 


           | LABIALS | LABIO-DENTALS | DENTALS     | RETROFLEXES | PALATALS | VELARS  | GLOTTALS
STOPS      | p (b)   |               | t (d)       | ʈ ɖ         |          | k g     | (ʔ)
AFFRICATES |         |               | ts dz       |             | tʃ dʒ    |         |
IMPLOSIVES | ɓ       |               | ɗ           |             |          |         |
FRICATIVES | β       | f v           | (θ) (ð) s z |             | ʃ (ʒ)    | (x) (ɣ) | h
NASALS     | m       |               | n           |             | ɲ        |         |
LATERAL    |         |               | l           |             |          |         |
TRILL      |         |               | r           |             |          |         |
GLIDES     | w       |               |             |             | j        |         |

Table 1 — Shingazidja consonants. 


A large portion of the Shingazidja lexicon was borrowed from Arabic some 
centuries ago, and more recently but to a lesser extent from French. As a 
consequence, many consonants (namely those indicated in parentheses) 
generally surface only in (Arabic or French) loanwords. This is the case for the 
voiced labial and dental stops (2, 3) and [ʒ] (cf. the word ʒandarmu ‘gendarme’). 


(2) bwáti¹ ‘box’ (< Fr. boîte) 
    faribó ‘charcoal’ (< Fr. charbon) 

(3) dúnia ‘world’ (< Ar. dunya ‘world’) 
dukutéra ‘doctor’ (< Fr. docteur) 


The interdental and velar fricatives and the glottal stop only appear in Arabic 
loanwords in formal speech (Ahmed-Chamanga 2010; Rombi & Alexandre 1982). 


(4) ðahábu ‘gold’ (< Ar. ðahab ‘gold’) 
    xatwári ‘danger’ (< Ar. xatar ‘danger’) 
    luɣá ‘tongue’ (< Ar. luɣah ‘tongue’) 


According to Rombi and Alexandre (1982), many speakers replace interdentals 
by [d], and velars by [h] (5). 


(5) hatwári ‘danger’ 
luha ‘tongue’ 


1 In this paper, underlining indicates that the tone bearing unit is lexically associated with a tone. Surface 
tones are signaled by acute accents. 

On the other hand, the voiced implosives /ɓ/ and /ɗ/ (6-a, 6-b) and the retroflexes 
(6-c) essentially occur in the Bantu lexicon (though not exclusively; there is 
variation among speakers). 


(6) a. mbwa ‘dog’ (vs. mbwá ‘ocean’) 
bá ‘misfortune’ (vs. ba ‘bench’) 
b. dada ‘wave’ 
dáo ‘plait’ 
c. hata mare ‘he spat’ 


It is not clear if prenasalized consonants in Shingazidja (i.e. /mb, mb, nt, nd, nts, 
ndz, nd, nd, nú, nds, nk, ng/) correspond to one or two phonemes. They will thus 
not be discussed in detail here, and they are not included in Table 1. 


2.3 Previous accounts of /r/ in Shingazidja 

No specific study has focused on /r/ in Shingazidja, and very few words have 
been written on the subject in studies with a broader purpose. 

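All authors who mention /r/ agree on its realization as a trill (e.g. “Das Phonem 
/r/ wird realisiert als stimmhafter alveolar Vibrant ([r])” [‘The phoneme /r/ is 
realized as a voiced alveolar trill ([r])’], Full 2006:114). Ahmed-Chamanga (2010), 
for instance, claims that “La consonne vibrante r du comorien est 
une consonne produite avec une vibration du bout de la langue au niveau des alvéoles. 
Elle ressemble au ‘r’ de l'italien ou de l'espagnol” [‘The Comorian trill r is a 
consonant produced with a vibration of the tip of the tongue at the alveolar ridge. 
It resembles the ‘r’ of Italian or Spanish’] (Ahmed-Chamanga 2010:24). 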
However, I rarely observed a clear trill realization when I worked with my 
previous informants, who came from various locations on the island. In my 
data, /r/ mostly appears as a tap. As we shall see in the following sections, the 
situation is different in Washili. 


3. Basic distribution of rhotics in Washili Shingazidja 


In this section, I examine the basic distribution of the trill [r] and tap [ɾ] 
allophones of /r/ in Washili Shingazidja. Section 3.1 discusses the trill realization 
that is associated with the absolute initial position. Section 3.2 deals with the 
tap allophone that emerges when /r/ is placed between two vowels inside the 
prosodic word, and section 3.3 shows that the tap is also selected when the 
intervocalic /r/ occurs at a word boundary. 


3.1 Absolute initial position 
In absolute initial position, /r/ mostly appears as a trill [r] in Washili 
Shingazidja, especially before [+back] vowels (7). It is important to note that 
many of the words that exhibit an initial [r] are of Arabic origin (for instance 
ruhúsa ‘permission’ < Ar. ruxsa)². However, this is not always the case (see some 
imperatives, where [r] appears initially: réma ‘beat!’, ruká ‘jump!’). 


(7) a. before [a]  [r]áha ‘joy, happiness’ 
                   [r]ággi ‘color’ 
                   [r]áði ‘blessing’ 
    b. before [o]  [r]óho ‘heart’ 
                   [r]óha ‘get out!’ 
    c. before [u]  [r]únga ‘pinch!’ 
                   [r]uhúsa ‘permission’ 


Almost all the trills in my data consist of two periods of vibration. The trill 
realization of the initial /r/ is illustrated in Figure 1. 

Figure 1 - Spectrogram of [r]áha ‘joy, happiness’. 


Before [i], however, the trill (almost) never appears. Most of the time, a tap [ɾ] 
(sometimes an approximant [ɹ]) is realized (8). 


(8) Before [i]  [ɾ]iyáli ‘money’ 
                [ɾ]i[ɾ]énge ‘we took’ 
                [ɾ]ínika ‘we gave’ 


2 It is not clear to what extent the words in (7) are of Arabic or Bantu origin. The interdental [ð] indicates 
that ráði ‘blessing’ most probably comes from Arabic (according to researchers such as Ahmed-Chamanga 
2010:22, interdental and velar fricatives only appear in Arabic loanwords). One reviewer has suggested that 
rúnga ‘pinch’ may come from the Proto-Bantu *tung- ‘to sew, thread’, or *táng ‘to tie up’; I cannot offer a 
better hypothesis. Róho ‘heart’ may correspond to the Proto-Bantu *jòjò ‘heart, life’ (Tervuren BLR3 database 
[http://www.africamuseum.be/collections/browsecollections/humansciences/blr]). 

Before [e], the trill is possible (9-a), though not frequent: [e] is usually preceded 
by a tap (9-b). 


(9) Before [e]  a. [r]ehéma ‘blessing’ 
                b. [ɾ]eggá ‘take!’ 
                   [ɾ]edzéi ‘come back!’ 


3.2 /r/ in intervocalic position 

In intervocalic position, /r/ usually emerges as a tap [ɾ] (10), sometimes as an 
approximant [ɹ] (the two sounds seem to be in free variation, perhaps partly 
depending on parameters such as the delivery or the style of speech), never as a trill. 


(10) [ɾ]i[ɾ]ágganya ‘we were destroyed’ 
     ma[ɾ]ávu ‘cheeks’ 
     ma[ɾ]á(g)o ‘pumpkins’ 
     mi[ɾ]á(n)da ‘orange tree’ 
     ma[ɾ]á(m)bo ‘insides’ 


The tap realization of the intervocalic /r/, which consists of a single closure, is 
illustrated in Figure 2. 

Figure 2 - Spectrogram of ma[ɾ]á(g)o ‘pumpkins’. 


The tap realization occurs before and after all vowels and especially, as expected, 
before front vowels (11). 

(11) mi[ɾ]éma ‘cultivated field’ 
     ma[ɾ]índ(i) ‘banana tree’ 


However, some rare items in my corpus, generally the first members of sequences 
of iterations (or items where the speaker overarticulates), involve a trill in this position. 


3.3 Intervocalic position across a word boundary 

Since /r/ appears as a trill in the absolute initial position and as a tap in 
intervocalic position, one may wonder how a word-initial /r/ emerges when the 
preceding word ends in a vowel. 
When /r/ appears between two vowels that belong to two different prosodic 
words belonging to the same prosodic phrase, the tap is selected (12-14). This is 
true no matter which vowels are involved, and regardless of whether the words 
are of Bantu or Arabic origin. 


(12) ( ts(i)onó [ɾ]ah(a) )φ ‘I saw joy’ 
     cf. [r]áha ‘joy, happiness’ 

(13) ( tsimbá [ɾ]að(i) )φ ‘I gave him (a) blessing’ 
     cf. [r]áði ‘blessing’ 

(14) ( ŋgamniko [ɾ]áhusa )φ ‘I gave permission’ 
     cf. [r]uhúsa ‘permission’ 


The tap realization of the intervocalic /r/ occurring at a word boundary is 
illustrated in Figure 3. 

Figure 3 - Spectrogram of ( tsimbá [ɾ]að(i) )φ ‘I gave him (a) blessing’. 


At the beginning of a non-initial prosodic phrase, /r/ also emerges as a tap (15). 


(15) [ ( nda=mí )φ ( na=wé )φ ]: [ ( [ɾ]endgez’=[é] ndei )φ ] 
     stab-1sg and-2sg 1pl(pas).raise=the prices 
     ‘it is you and I who raised the prices’ 


The distribution of rhotics in Washili Shingazidja thus seems to be quite simple: 
the trill realization is restricted to the absolute initial position, except before 
front vowels, while the tap (or its approximant variant) is selected in intervocalic 
position, whether the intervocalic position occurs inside the word or between 
two words. In Section 4, I will show that the picture is a bit more complicated. 


4. Complications 

In this section, I discuss the distribution of rhotics in Washili Shingazidja as a 
function of the presence of consonants before /r/ and the absence of high tones 
on a following vowel when /r/ occurs in word-initial position. In the former 
case, discussed in Section 4.1, a trill is selected. In the latter, discussed in section 
4.2, /r/ may be realized as a tap, or even be deleted. 
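
The distributional generalizations stated so far can be summarized as a small decision procedure. The sketch below is only a restatement of the descriptive statements in Sections 3 and 4, not part of the analysis: the function name, the labels and the order in which the conditions are checked (in particular, tone being checked before vowel frontness in absolute initial position) are assumptions made for the example, and the rare exceptions mentioned in the text are ignored.

```python
FRONT_VOWELS = {"i", "e"}


def predicted_realization(preceded_by, next_vowel, next_vowel_high_toned):
    """Predict the usual realization of /r/ in Washili Shingazidja.

    preceded_by: "pause" (absolute initial), "vowel" or "consonant".
    next_vowel: one of "i", "e", "a", "o", "u".
    next_vowel_high_toned: whether the following vowel bears a high tone.
    """
    if preceded_by == "consonant":
        return "trill"  # Section 4.1: trill after a consonant
    if preceded_by == "vowel":
        return "tap"  # Sections 3.2 and 3.3: tap (or approximant) between vowels
    if preceded_by == "pause":
        if not next_vowel_high_toned:
            return "lenited"  # Section 4.2: no trill; /r/ is weakened or deleted
        # Section 3.1: trill, except (usually) before front vowels
        return "tap" if next_vowel in FRONT_VOWELS else "trill"
    raise ValueError("unknown context: " + repr(preceded_by))
```

For instance, `predicted_realization("pause", "a", True)` returns `"trill"`, while the same call with `"vowel"` as the first argument returns `"tap"`.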


4.1 After a consonant 
Washili Shingazidja clearly differs from the other Shingazidja varieties in the 
behavior of /r/ after a consonant. In this situation, /r/ emerges as a trill (16, 17).³ 


(16) a. nd[r]avu ‘branch(es)’ 
        nd[r]óvi ‘banana(s)’ 
     b. m-be[ɾ]é n-d[r]a[ɾ](u) ‘three rings’ 
        10-ring 10-three 


(17) m[r]áha ‘game (specific)’ 
     m[r]ámbuwa ‘you (pl.) recognized’ 


In the variety of Shingazidja that is spoken in Moroni, and to a lesser extent in 
other varieties, the sequence /d/ɗ + r/ emerges as a retroflex [ɖ] (‘lies’ is realized 
ndabg in Moroni, ndrabg in Washili). 


The trill realization of the /r/ occurring after a consonant is illustrated in Figure 4. 


3 I have no evidence for or against an analysis where [dr] synchronically corresponds to a single affricate consonant, 
a hypothesis suggested by a reviewer. This idea receives support from Proto-Bantu (WS m-drí ‘tree’ < PB *mu-ti) 
and the fact that the implosive comes from post-nasal fortition (cf. n-dráru ‘cl.10-three’ vs. mi-ráru ‘cl.4-three’). 


Figure 4 - Spectrogram of m[r]áha ‘game (specific)’. 


Interestingly, the post-consonantal rhotic also emerges as a trill when the vowel 
that follows is [i] (18). 


(18) md[r]i ‘tree’ 


Between [m] and [i], the rhotic does not appear as a single tap either, since two 
closures are clearly perceptible. /r/, in this case, is realized as [Pr]. 


(19) m[’r]ima ‘African (coast)’ 
m[r]ísiza ‘you (pl) frightened’ 


In classes 3⁴ (or 1) and 4, there is thus an alternation between [ɾ], occurring in 
class 4 (between two vowels, the class 4 prefix being mi-, the class 2 one wa-), 
and [r], occurring in classes 1 and 3 (the prefix in classes 1 and 3 is m-) (20). 


(20) a. m-[r]ó ‘river’ vs. mi-[ɾ]ó ‘rivers’ 
     b. m-[r]éma ‘field’ vs. mi-[ɾ]éma ‘fields’ 
     c. m-[r]ánda ‘orange tree’ vs. mi-[ɾ]ánda ‘orange trees’ 


These alternations also occur after the 1st and 2nd person plural prefixes of the past 
(perfective) (21). 


(21) m-[r]ámbuwa ‘you (pl.) recognized’ 
     vs. 
     [ɾ]i-[ɾ]ámbuwa ‘we recognized’ 


4 Like the other Bantu languages (see Katamba 2003 for details), Shingazidja has a gender class system, 
where singular nouns belong to classes 1, 3, 5, 7, etc. while plural nouns belong to classes 2, 4, 6, 8, etc. 


4.2 /r/ and tones 

In absolute initial position, when the vowel that follows /r/ is not associated with 
a high tone, there is usually no trill (there are some rare exceptions in my corpus)⁵. 
Most of the time, /r/ then appears as an approximant (22), and can even be deleted. 


(22) [ɹ]ahisi ‘inexpensive’ 
     [ɹ]aggá ‘from’ 
     [ɹ]aili ‘as long as’ 
     [ɹ]uká ‘jump!’ 


When the /r/ occurs in intervocalic position, however, the tone does not play a 
role — compare (23-a) to (23-b). 


(23) a. ba[ɾ]áka ‘blessing’ 
     b. bá[ɾ]aza ‘jury, committee’ 


5. Analysis 


5.1 Background 

Allophonic situations where a trill appears in the initial position and a tap (or 
a flap) occurs in the intervocalic position are far from rare (see Bradley 2001; 
Inouye 1995; Lindau 1985; Recasens 1991; Walsh Dickey 1997; Wiese 2001, 
2011, among many others). This is for instance the case in Romanian (Chitoran 
2001), Northern Italian (Recasens 2002) and Farsi (“In Farsi, /r/, which is a trill 
in initial position, has a tap allophone in intervocalic position and a voiceless 
trill variant in word-final position” Ladefoged & Maddieson 1996:216). 
However, such a distribution, to my knowledge, has never been identified in 
any Bantu language. Nevertheless, according to Gérard Philippson (personal 
communication), it also appears in Chaga (Tanzania), a language that possesses 
another phonemic rhotic. In Davey et al. (1982), a paper concerning liquids in 
Chaga, this distribution is not explicitly discussed. However, the figures that 
illustrate the paper seem to attest its existence. 


5 In Shingazidja, the presence of a tone may also prevent vowels from gliding and from deletion (Patin 2009). 


6 The trill is voiceless in this position according to Majidi (1986, mentioned by Wiese 2011). 


5.2 Analysis 

How can we account for the allophonic distribution of /r/ in Washili Shingazidja? 
I will assume that the trill realization that occurs in the absolute initial position 
corresponds to a geminate association (24). 


(24) C V C V 


I adopt in (24) a CVCV representation. The idea behind CVCV (Lowenstamm 
1996; Scheer 2004), a theory that emerges from Government Phonology (Kaye 
et al. 1985, 1990), is that constituent structure can be reduced to a strict sequence 
of non-branching Onsets and non-branching Nuclei. 

Three arguments support the structure in (24). First, it must be noted that many 
words involving a trill in the initial position are Arabic loanwords (25). 


(25) [r]uhúsa ‘permission’ (< Ar. ruxsa) 
     [r]áði ‘blessing’ 


A geminate, in Arabic, results from assimilation of the article al when it is 
followed by ‘r’ (Classical Arabic: *al-raʔs → ar-raʔs ‘the head’, Alfozan 1989). 


(26) a[r]uh < (a)l-ruh ‘the soul’⁷ 


One could suggest that combinations of (a)l+word or the forms where a 
geminate appears were borrowed⁸. An argument supporting this idea is the fact 
that several French loans were borrowed with the definite article. 


7 Rachid Ridouane, personal communication. 


8 If this idea is borne out, one might expect similar effects to be observed at least on some of the other 
consonants that are implicated in the assimilation process that involves the article in Arabic, especially the 
other coronals (/s/, /n/, etc.). A systematic examination of the Arabic loanwords, which has not yet been 
conducted, is thus necessary. Thanks to Jean-Marc Beltzung for this suggestion. 


(27) labiéra ‘beer’ (< Fr. la bière) 
     latábu ‘table’ (< Fr. la table) 
     lagilízi ‘church’ (< Fr. l'église) 
     lávani ‘vanilla’ (< Fr. la vanille) 
     lakóli ‘glue’ (< Fr. la colle) 


Cassimjee & Kisseberth (to appear) 


The result would involve an initial CV site⁹ in the first position of the prosodic
word (28).


(28) C V C V C V 


lcs 


r a h a 


Washili Shingazidja, in this respect, would be more conservative than other 
varieties, where the trill is rare in initial position. 

The second argument that supports the structure in (24) is the fact that the
initial trill cannot be considered the fortis counterpart of the tap. Fortition
involves a (voiceless) retroflex, which is generally associated with clear friction
in Washili Shingazidja: [ʈ]. In classes 5 and 6, there is an alternation between
[ɾ], which appears in class 6 (between two vowels, the class 6 prefix being ma-),
and [ʈ], which appears in class 5 (the main class 5 allomorph being Ø-).


(29) a. ma-[ɾ]índ(i) ‘banana trees (class 6)’
vs.
Ø-[ʈ]ind(i) ‘banana tree (class 5)’
b. ma-[ɾ]ávu ‘cheeks (class 6)’
vs.
Ø-[ʈ]ávu ‘cheek (class 5)’
c. ma-[ɾ]óne ‘drops (class 6)’
vs.
Ø-[ʈ]óne ‘drop (class 5)’


Other alternations in this gender involve the transformation of [D] (class 6) to
[p] (class 5) (30a), [h] (class 6) to [k] (class 5) or [l] (class 6) to [d] (class 5)
(30b).


9 A lexical one, distinct from that which was proposed in Lowenstamm (1999).
(30) a. Ø-[p]áha ‘cat (class 5)’
vs.
ma-[D]áha ‘cats (class 6)’
b. Ø-[d]íngo ‘back (class 5)’
vs.
ma-[l]ingo ‘backs (class 6)’


In the CVCV framework, it will be assumed that fortition here involves an
initial empty 'CV-' slot (a hypothesis originally discussed in Mohamed-Soyir
2005).


(31) [diagram: an initial empty CV slot precedes /r a/; /r/ is realized as the retroflex]


In (31), the second V slot governs and thus strengthens the first V position,
leading to the fortition of the second consonant slot. The same configuration,
usually referred to as coda-mirror, also explains the fortition of a post-coda
consonant, according to Ségéral & Scheer (2001, 2005, 2008, among others).
Gemination, if the analysis is retained, would thus differ from fortition in
Shingazidja.

The final argument in favor of the analysis derives from the synchronic
alternation between casual and formal speech. Consider, for instance, the verbal
form in (32).


(32) [r]i[e]i ‘we played / we feared’ 


In casual speech, the first vowel can be dropped, leading to the realization of a 
trill (33). 


(33)a. [r]í mpi[ilá ‘we played a game’ 
b. (ndaemí)e (na-wé)e ( [eli mhu Je 


‘(that's) me and you who feared God’


The trill realization in (33a) is illustrated in Figure 5.

Figure 5 - Spectrogram of ([r])[r]í mpi[r]á ‘we played a game’.


In intervocalic position, /r/ is not able to spread through a vocalic association
(34). Since it is not associated with two skeletal positions, it cannot emerge as
a trill.
(34) C V - C V C V
     m a   r a v u


Because it is governed by the following vowel in the second version of the
‘coda-mirror’ representation (Scheer & Ziková 2011), the consonant cannot undergo
fortition either.


5.3 Remaining questions 

I propose that the trill realization that appears after a consonant results from a
joint association to the first C slot. Such a representation is, however, excluded
by the model (Scheer 2009: “[...] consonants can geminate after floating, but
not stable consonants”).


(35) C V C V C V
     (n) d r o Y i
Double association of segments to a single C position is indeed problematic,
since it would make the model too powerful. However, the idea is still
worth considering, since it may account for situations such as the distinction 
between prenasalized consonants, where two consonant elements would be 
associated with a single slot, and N+C sequences, where the same consonant 
elements would be linked to two different slots. This hypothesis, however, needs 
deeper examination if it is to be retained. 

CVCV representations cannot account for the tap realization before [i], 
nor can they explain why a trill can emerge between a consonant and [i]. 
As for the absence of the trill before a front vowel in initial position, it 
should be noted that the trill is characterized by a contact between the tip 
of the tongue and the alveolar area. Such a configuration hardly corresponds 
to the position of the tongue during the production of [i]. Other Bantu
languages do not allow the sequence [ri], e.g. Simakonde (Sophie Manus, 
personal communication), and several studies have discussed the (relative) 
incompatibility of trills with (high) front vowels and/or palatalization: 
among others, Hall & Hamann (2009); Kavitskaya (1997); Zygis (2005). 
Recasens (2002:346), for instance, claims that “the occasional simplification 
of the trill before a high front vowel is rather associated with the difficulty 
involved in performing two successive antagonistic tongue dorsum gestures, 
i.e. tongue dorsum lowering and retraction for [r] and tongue dorsum raising 
and fronting for [i]”. 

This explanation alone fails to account for the emergence of the trill before [i]
when the rhotic follows another consonant. The energy provided by the closure
may explain this distinction.¹¹


6. Conclusions 


This paper is the first discussion of the allophonic variation of /r/ in the Washili
variety of Shingazidja, a Bantu language. It is claimed that /r/ has two main
allophones: (i) a trill [r], which occurs in the absolute initial position (whenever
the following vowel bears a high tone) and after a consonant; and (ii) a tap [ɾ],
which is selected in intervocalic position, within a word and between two words.
A CVCV analysis of this distribution has been sketched out, claiming that the 
trill allophone is underlyingly a geminate. However, the CVCV analysis fails to 


10 A reviewer has pointed out that “there is an occurrence of [d] before [i] vs. a liquid before other vowels in
Kikongo and Sotho-Tswana, and that there is dialectal documentation showing an [r]”.


11 Suggestion from Bernard Gautheron, personal communication.
account for some facts, such as the alternation of /r/ before front vowels. Further
investigations, including a deeper articulatory analysis, are required. 
Prosodic factors in the adaptation of Hebrew
rhotics in loanwords from English 


Evan-Gary Cohen, Tel-Aviv University 


Abstract 

The behaviour of rhotics in (Modern) Hebrew loanwords from English differs from
that of all other consonants. Rhotics metathesise, and words containing rhotics show
a preference for pseudo-reduplicative structures. Within an Optimality Theoretical
framework, I argue that this unique behaviour results from the interaction among various
universal well-formedness constraints, whose effect is unattested in native Hebrew
grammar. This is evidence of the role of phonological universals in adult grammars.


1. Introduction 


This paper focuses on Hebrew rhotics in loanwords from English. The 
aberrant behaviour of rhotics in adaptation, exhibiting phenomena such as 
metathesis and reduplication, is explained by appealing to the role of universal 
well-formedness constraints on syllable and word structure. Crucially, the 
application of these constraints is not supported by the native Hebrew 
grammar, and is, I argue, evidence of the role of phonological universals in 
adult grammars. 


1.1 Basic assumptions 

Grammatical principles operating in a language logically come from one of two 
sources: (a) the native grammar, or (b) universal principles (UG). 

I assume that the lexicon is divided into strata (Itô & Mester 1999) or has a
core-periphery structure (Paradis & LaCharité 1997). Such a structure allows 
for variable grammars within the lexicon. There are productive principles in 
the lexicon’s periphery (e.g. loanwords, acronyms) which might not apply 
systematically to the native lexicon (Kenstowicz 2003; Shinohara 2004; Berent 
et al. 2009; Cohen 2011 inter alia). This may be evidence that we can and do 
access UG when the effects of L1 grammar are weakened. The emergence of
such universal principles in the lexicon’s periphery is known as The Emergence
of the Unmarked (TETU, McCarthy & Prince 1994). 


1.2 Goals 

The goal of this paper is twofold. First, I demonstrate that the non-native
metathesis and reduplication of Hebrew rhotics in loanwords is systematic,
i.e. subject to a grammatical system. Second, via the analysis of prosodic
phenomena involving rhotics, I support an approach advocating the universal 
motivation of the rhotics' behaviour. UG may apply even in what appear to be 
stable grammatical systems, especially in the lexical periphery of such systems. 


1.3 Structure of paper 

In §2, an overview of metathesis and reduplication in Hebrew is provided. This
is followed in §3 by data displaying the behaviour of rhotics in loanwords. A
formal analysis within an Optimality Theoretical framework in the subsequent
§4 is followed by concluding remarks in §5.


2. Metathesis and reduplication in the native Hebrew lexicon 


The following section is an overview of the native Hebrew grammar, particularly 
with respect to metathesis and reduplication. I argue that the behaviour of 
rhotics in loanwords cannot be supported by this native grammar. 


2.1 General facts about Hebrew 

2.1.1 Rhotics 

There is one rhotic in the native Hebrew inventory, [ʁ] (henceforth: ʁ), a uvular
approximant with some frication (Bolozky & Kreitman 2007). The precise
manner of articulation is usually determined by prosody, with onsets displaying 
more frication. 
2.1.2 Syllable structure
Native Hebrew words have the following syllable structures:


σ    | INITIAL               | MEDIAL                 | FINAL
CV   | la.kax ‘he took’      | ka.ta.va ‘article’     | ka.ʁa ‘happened’
CVC  | ʃul.xan ‘table’       | hit.kaʁ.bel ‘snuggled’ | ʃad.xan ‘stapler’
V    | a.biʁ ‘knight’        | no.a.lim ‘locking’     | ka.vu.a ‘permanent’
VC   | of.no.im ‘motorbikes’ | ne.ez.vu ‘were left’   | bo.eʃ ‘skunk’
CCV  | tku.fa ‘period’       | ---                    | ---
CVCC | ---                   | ---                    | ka.tavt ‘you (fem.) wrote’


Table 1 — Syllable structure in native Hebrew words. 


Complex margins are noticeably rare in native Hebrew words, with complex 
onsets appearing only word initially, and complex codas appearing only word 
finally in 2nd person feminine singular past. All complex edges respect the
Sonority Sequencing Generalisation (SSG; Steriade 1982) allowing sonority 
rises and plateaus towards the vocalic nuclei, but never sonority falls (Bolozky 
1978; Bat-El 1994). 
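
The SSG condition described above can be made concrete with a small check. The following is a minimal sketch only: the sonority scale, the segment classes and the function name `respects_ssg` are assumptions introduced here for illustration and are not part of the analysis in the sources cited.

```python
# Illustrative sonority scale (an assumption, not taken from the source):
# obstruents (1) < nasals (2) < liquids and the rhotic (3) < glides (4).
SONORITY = {}
for segment in "ptkbdgfvszxʃʒh":
    SONORITY[segment] = 1
for segment in "mn":
    SONORITY[segment] = 2
for segment in "lʁ":
    SONORITY[segment] = 3
for segment in "jw":
    SONORITY[segment] = 4


def respects_ssg(onset, coda):
    """Check a syllable's margins against the SSG as stated in the text.

    onset: consonants before the nucleus, left to right.
    coda: consonants after the nucleus, left to right.
    Rises and plateaus towards the nucleus are allowed; falls are not.
    """
    onset_values = [SONORITY[c] for c in onset]
    coda_values = [SONORITY[c] for c in coda]
    # Moving towards the nucleus, onset sonority must never decrease.
    onset_ok = all(a <= b for a, b in zip(onset_values, onset_values[1:]))
    # Moving away from the nucleus, coda sonority must never increase.
    coda_ok = all(a >= b for a, b in zip(coda_values, coda_values[1:]))
    return onset_ok and coda_ok


print(respects_ssg(["t", "k"], []))  # onset of tku.fa: a plateau
print(respects_ssg([], ["v", "t"]))  # coda of ka.tavt: a plateau
print(respects_ssg(["ʁ", "t"], []))  # hypothetical onset: a fall
```

Under these assumed sonority values, the two attested margins pass the check and the hypothetical rhotic-initial onset fails it, since sonority falls towards the nucleus.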

Loanwords, however, have a richer syllabic inventory (Bat-El 1994; Schwarzwald 
2002). Tri-consonantal sequences may appear if they respect the SSG and do 
not have sonorant clusters (e.g. stʁuk.tu.ʁa ‘structure’, tekst ‘text’).


2.2 Reduplication in Hebrew 

There is productive morphologically-motivated reduplication in the Hebrew 
lexicon (Bat-El 2006). 

First of all, reduplication is invoked in template (binyan) satisfaction. All verbs 
in Hebrew are subject to templatic restrictions imposed by one of the binyanim. 
Novel verbs are almost invariably formed within the pi'el template, a disyllabic
binyan with a XiXeX vocalic pattern (e.g. tsad ‘a side’ > tsided ‘to side with’; daf
‘a page’ > difdef ‘to page through’) (Bat-El 1994; McCarthy & Prince 1995;
Gafos 1998; Ussishkin 2000).

In addition to template satisfaction, reduplication is a means of lexical expansion, 
the addition of new lexical items which are semantically similar to existing 
items (e.g. iʃeʁ ‘to confirm’ vs. iʃʁeʁ ‘to ratify’).

Finally, diminutives in nominals may be formed by reduplication (e.g. dag/dagig 
‘fish/small fish’; xaziʁ/xazaʁziʁ ‘pig/piglet’; kaxol/kxalxal ‘blue/bluish’).
To sum up, reduplication is a derivational process in Hebrew, a strategy of stem
formation whose purpose is to form different yet semantically related words.
It is (almost invariably) at the word's right edge, and invariably forms prosodic
structures already available as unreduplicated forms, unmarked (C)VC syllables,
avoiding the creation of clusters (see discussion in §2.1.2 and §4.3 regarding
clusters).


2.3 Metathesis in Hebrew 

In Hebrew, metathesis does not occur systematically, except for a single instance,
strident-initial stems after the hit- prefix of the hitpa'el binyan (Schwarzwald
2002):


UNDERLYING                | SURFACE    | CF.
/hit-saʁek/ ‘he combed’   | [histaʁek] | /hit-ʁasek/ ‘he crashed’ > [hitʁasek]
/hit-ʃapeʁ/ ‘he improved’ | [hiʃtapeʁ] | /hit-paʃeʁ/ ‘he compromised’ > [hitpaʃeʁ]


Table 2 — Strident metathesis in hitpa'el. 


The stem-initial strident metathesises with the prefix-final t. This process is
restricted to stem-initial stridents in hitpa'el.

An additional case in which sporadic cases of metathesis are found in Hebrew 
is during acquisition, where universal principles are known to surface (Berent et 
al. 2009), often overriding native grammar. In some cases, metathesis is found, 
specifically to avoid complex codas, preferring complex onsets to them. For 
example, the adult forms disk ‘disk’ and tost ‘toast’ may be produced as sdik and
stot respectively. 


2.4 Summary 

Both reduplication and metathesis do occur in Hebrew. However, reduplication 
is morphologically restricted to lexicon expansion. Metathesis is not only 
morphologically restricted to the hitpa'el binyan, but is also segmentally
restricted to stridents. Neither of the two processes is segmentally restricted to 
or unique to rhotics. 
3. Rhotics in Adaptation


3.1 Segmental adaptation 

The segmental adaptation of rhotics, remarkably straightforward, is not 
relevant to the discussion of metathesis and reduplication. Cohen (2010) 
presents a 1383-word Hebrew loanword corpus, constructed from three 
different sources: elicitation from native speakers, spontaneous productions and 
previous publications on Hebrew loanwords. In this corpus, English rhotics are 
invariably adapted into Hebrew as the native rhotic ʁ. Note that many of the words
in the corpus do not originate in English; however, they entered Hebrew via
English mediation. Therefore, I generally refer to the words as loanwords from
English. Word-final rhotics from non-rhotic English dialects with no input
surface rhotic also surface as ʁ (e.g. British English ɑftə ‘after (military term)’
> Hebrew afteʁ). The similarity-based phoneme mapping in the adaptation of
rhotics into Hebrew has multiple sources in English, both perceptual (Lindau 
1985; Ladefoged & Maddieson 1996; Magnuson 2007) and orthographic 
(Vendelin & Peperkamp 2006; Escudero et al. 2008), which may even provide 
conflicting evidence (Smith 2005; Cohen 2010:137). 


3.2 Prosodic phenomena in adaptation 

Prosodic, rather than segmental, phenomena, restricted to rhotics, are at the focal 
point of this paper. In the realm of consonant adaptation in Hebrew loanwords, 
the behaviour of the rhotics is unique, as other consonants are ordinarily adapted 
1-to-1 to the closest native category, with no prosodic modification. 

The only instances in which there is some prosodic modification are: (a) deletion, 
to avoid complex syllable margins (e.g. ɪgzɔst ‘exhaust’ > egzoz; hændbɹeɪks
‘handbrake’ > ambʁeks) and (b) epenthesis, to avoid certain complex codas
(e.g. film ‘film’ > filim). 

There are also two additional prosodic phenomena, both of which are unique
in adaptation to rhotics: (a) neither is native to Hebrew grammar and (b) both
are optional (variation among speakers and lexical items), but when they occur,
they occur systematically. In addition to the corpus mentioned in §3.1, most
of the examples in this paper were collected from speaker productions, both in 
conversation and in the media. 
3.2.1 Metathesis
ʁ is metathesised from coda into onset position:


NORMATIVE          COLLOQUIAL         GLOSS
koʁnfleks          kʁonfleks          ‘cornflakes’
endoʁfinim         endʁofinim         ‘endorphins’
fabeʁʒe            fabʁeʒe            ‘Fabergé (egg)’
geʁbeʁ             gʁebeʁ             ‘Gerber’
goʁgonzola         gʁogonzola         ‘gorgonzola’
lunapaʁk           lunapʁak           ‘Lunapark’
maskaʁpone         maskʁapone         ‘mascarpone’
oligaʁx            oligʁax            ‘oligarch’
peʁfektsijonizem   pʁefektsijonizem   ‘perfectionism’
peʁfoʁmeʁ          pʁefoʁmeʁ          ‘performer’
peʁfumeʁija        pʁefumeʁija        ‘perfumery’
peʁspektiva        pʁespektiva        ‘perspective’
poʁtʁet            pʁotʁet            ‘portrait’
ʁepeʁtuaʁ          ʁepʁetuaʁ          ‘repertoire’
supeʁfaʁm          supeʁfʁam          ‘Super Pharm’


Table 3 - ʁ-metathesis in Hebrew loanwords from English.


Note, as mentioned in §3.1, some of the above words do not originate in English
(e.g. gorgonzola, maskarpone); however, they entered Hebrew via the English
form, rather than directly from the source language (in these cases, Italian). The
process is optional in colloquial Hebrew.


3.2.2 Reduplication 
ʁ is metathesised from onset into coda position, creating a reduplication-like
structure (Zuraw 2002). Henceforth, I will refer to these cases as
pseudo-reduplication¹:


NORMATIVE          COLLOQUIAL         GLOSS
inteʁpʁetatsija    inteʁpeʁtatsija    ‘interpretation’
pʁopoʁtsija        poʁpoʁtsija        ‘proportion’


Table 4 — Pseudo-reduplication in forms with rhotics. 


1 In a single instance in loanwords, there is even ʁ-epenthesis, which creates a pseudo-reduplicated form:
dioʊdəɹənt ‘deodorant’ > doʁdoʁant. A similar process is found in very few native Hebrew words, such as
ʃfofeʁet ‘telephone receiver’ > ʃfoʁfeʁet.
While all the above forms in Tables 3 and 4 optionally undergo the processes
mentioned, there are several forms where nothing happens. I do not account for 
this variation (or lack thereof) in this paper. The following data are some words 
in which none of the processes under discussion occurs: 


NORMATIVE/COLLOQUIAL   UNATTESTED    GLOSS
foʁmalin               *fʁomalin     ‘formalin’
foʁmaika               *fʁomaika     ‘formica’
goʁme                  *gʁome        ‘gourmet’
poʁtselan              *pʁotselan    ‘porcelain’
poʁtabelo              *pʁotabelo    ‘portabello (mushroom)’
toʁtija                *tʁotija      ‘tortilla’


Table 5 — Non-varying forms. 


4. Analysis 


4.1 Theoretical background 

The underlying segmental representation undergoes modification resulting
from constraint interaction (Optimality Theory, OT; Prince & Smolensky
1993/2004). Two types of constraint interact with one another: (a) faithfulness
constraints requiring input-output correspondence and input string sequences 
to be preserved, e.g. constraints militating against deletion or metathesis, and 
(b) markedness constraints requiring surface forms to comply with universal 
preferences, which may force metathesis and pseudo-reduplication. 

In OT, each given input has several possible outputs. Each possible output is 
evaluated by the language's grammar, which defines how bad each candidate is 
(there are no good candidates because no candidate is perfect, as all violate some 
constraint). The least bad candidate in a given set of candidates is the most harmonic
candidate, the optimal candidate. The possible outputs are evaluated within a
constraint-based system in which the constraints are ranked and candidates are
eliminated by evaluating them against the highest ranked constraint and then the 
other constraints in descending order until all but one candidate are eliminated. 
The remaining candidate is the optimal one and the selected output. 
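
The evaluation procedure just described can be sketched in a few lines of code. The example below is an illustration under stated assumptions, not an implementation taken from the OT literature: the function name `evaluate`, the constraint labels `A` and `B` and the candidate names are all hypothetical. Each candidate is represented by its number of violations per constraint, and candidates are filtered one constraint at a time, from the highest-ranked constraint downwards.

```python
def evaluate(profiles, ranking):
    """Return the candidates that survive strict-domination evaluation.

    profiles: dict mapping each candidate to a dict of
              constraint name -> number of violations (missing means 0).
    ranking:  list of constraint names, highest-ranked first.
    """
    survivors = list(profiles)
    for constraint in ranking:
        # The fewest violations among the remaining candidates wins the round.
        fewest = min(profiles[c].get(constraint, 0) for c in survivors)
        survivors = [c for c in survivors
                     if profiles[c].get(constraint, 0) == fewest]
        if len(survivors) == 1:
            break  # all but one candidate have been eliminated
    return survivors


# Hypothetical candidates and constraints, for illustration only.
toy_profiles = {
    "cand1": {"A": 1},
    "cand2": {"B": 2},
    "cand3": {"B": 1},
}
print(evaluate(toy_profiles, ["A", "B"]))
print(evaluate(toy_profiles, ["B", "A"]))
```

With `A` ranked above `B`, `cand1` is eliminated first and `cand3` beats `cand2` on `B`. With the ranking reversed, `cand1` wins, because it incurs no violation of the now top-ranked `B`. The function returns a list so that a tie between candidates remains visible.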

In addition to an OT grammar, I also assume that the lexicon is stratified (§1.1).
Faithfulness constraints relevant to loanwords (and so indexed) may be ranked
differently with respect to markedness constraints than faithfulness constraints
pertaining to native words (Itô & Mester 1999).
4.2 Metathesis


Why should metathesis occur at all? The English rhotic has long-range acoustic 
resonances (Kelly & Local 1986; Hall 2009). These result in rhotics being 
perceived where they are not actually present, and epenthesised due to this
misperception (e.g. ʃɜɹbɪt > ʃɜɹbəɹt ‘sherbet’). Because of this resonance, listeners
perceive the rhotics but are not necessarily aware of their input string position.
Generally speaking, the Hebrew rhotic ʁ is acoustically ‘weak’. Specifically, it
is even weaker in coda position, more so than other consonants. Therefore, it 
‘favours’ syllable onsets, which are more perceptible than the codas. This being 
said, rhotics perceived in whatever position preferably surface in the onset 
position, if possible. 

These observations can be translated into a formal OT grammar. The rhotic
in Hebrew loanwords from English is assigned a prosodic position which
facilitates its optimal perception in Hebrew, driven by perceptibility constraints
based on models such as Steriade's (2001/2008) P-map. I assume the input
to the grammar is the form as produced in English. I propose the constraint
*CODA-r:


*CODA-r: Rhotics are avoided in coda position.


Note, although this constraint may be perceptually motivated, it seems to 
contradict the general notion that liquids are good codas in languages, as codas 
tend to be as sonorous as possible. Further cross-linguistic evidence for the 
proposed constraint is presented in §5. 

In addition, there are faithfulness constraints militating against deletion or the
change in the linear order of the input segments, such as MAX and LINEARITY_N/LW:


MAX: All input segments have correspondents in the output (i.e.
don't delete segments).


LINEARITY_N/LW: Preserve linear order of input segments in native (N)/
loanwords (LW) respectively (i.e. no metathesis).


Recall the metathesised forms in §3.2.1 (e.g. /endoʁfinim/ ‘endorphins’ >
endʁofinim). In native words, the faithfulness constraints LINEARITY_N and MAX
are not violated. In loanwords, ordinarily, there is no reason to violate these
constraints either, as all segments (except #) are possible in codas. However,
when the coda is a rhotic, the markedness constraint *CODA-r is potentially
violated. It is possible to avoid the violation of this markedness constraint
by violating LINEARITY_LW, metathesising the coda rhotic into onset position.
However, this creates a complex syllable onset, violating a different markedness
constraint:


*Cx - No complex syllable margins. 


Clearly, *CODA-r outranks *Cx (expressed as: *CODA-r >> *Cx), otherwise
rhotics would never be metathesised out of coda position, subsequently creating
complex onsets. The grammar considers three primary candidates, each violating
different constraints²:


/endoʁfinim/    | *CODA-r : MAX | LINEARITY_LW : *Cx
   endoʁfinim   | *!      :     |              :
   endofinim    |         : *!  |              :
☞  endʁofinim   |         :     | *            : *


Table 6 — Evaluating candidates for /endosfinim/. 


The fact that the grammar prefers endʁofinim is evidence that LINEARITY_LW
and *Cx are the lowest ranked of the four constraints (there is no evidence of ranking
between them, hence the dotted line indicating no crucial ranking). Since the
potential violation of *CODA-r is not resolved by deletion, there is no evidence
for any ranking between *CODA-r and MAX. However, both of these constraints
are clearly more highly ranked than LINEARITY_LW and *Cx. Note, both MAX
and LINEARITY_N are highly ranked in Hebrew, but while codas, in general, are
disfavoured, the specific constraint *CODA-r is not visible in the native lexicon,
as it is dominated by LINEARITY_N. This is where the notion of the Emergence of
the Unmarked (TETU, §1.1) comes into play. Although native Hebrew words
allow coda rhotics, providing no evidence for their universal markedness due to
the high-ranking LINEARITY_N, the unmarked structures without coda rhotics
emerge in loanwords, as these are not subject to LINEARITY_N, but rather to the
lower ranked LINEARITY_LW.

Since LINEARITY_N, which prevents metathesis in native Hebrew nouns, outranks
*CODA-r, the lexicon is effectively stratified into words which metathesise coda
rhotics (loanwords) and those which do not (native).
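
The effect of the two strata can be illustrated with a short computation over the violation marks of Table 6. This is a sketch under explicit assumptions: the function `evaluate` and the variable names are hypothetical, the single label `LINEARITY` stands for whichever indexed version applies in the stratum being evaluated, and each ranking is written as one total order that is merely consistent with the partial rankings argued for above. Applying the native ranking to the same three candidates is a thought experiment, not an attested native derivation.

```python
def evaluate(profiles, ranking):
    """Strict-domination evaluation: filter candidates constraint by constraint."""
    survivors = list(profiles)
    for constraint in ranking:
        fewest = min(profiles[c].get(constraint, 0) for c in survivors)
        survivors = [c for c in survivors
                     if profiles[c].get(constraint, 0) == fewest]
    return survivors


# Violation marks following Table 6.
TABLE_6 = {
    "endoʁfinim": {"*CODA-r": 1},                # faithful candidate
    "endofinim": {"MAX": 1},                     # rhotic deleted
    "endʁofinim": {"LINEARITY": 1, "*Cx": 1},    # rhotic metathesised
}

# Loanword stratum: *CODA-r, MAX >> LINEARITY_LW, *Cx.
LOANWORD_RANKING = ["*CODA-r", "MAX", "LINEARITY", "*Cx"]
# Native stratum: LINEARITY_N (and MAX) outrank *CODA-r; the position of
# *Cx at the bottom is an assumption made for this illustration.
NATIVE_RANKING = ["LINEARITY", "MAX", "*CODA-r", "*Cx"]

loan_winner = evaluate(TABLE_6, LOANWORD_RANKING)
native_winner = evaluate(TABLE_6, NATIVE_RANKING)
print(loan_winner)
print(native_winner)
```

Under the loanword ranking the metathesised candidate survives, whereas under the native ranking the faithful candidate with a coda rhotic does, which is the stratification effect described in the text.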


2 Theoretically, there are more than three candidates, but these are the most important.
4.3 Pseudo-reduplication

What is the motivation for non-morphological reduplication? Reduplication
results in a string occurring twice in a single stem. Lexical representations of such
forms are simpler than forms in which all segments are different (i.e. less lexical
information) and the production of such forms is simpler, as it requires the
repetition of articulatory motor sequences rather than introducing new sequences.
This can be translated into an OT grammar. First of all, pseudo-reduplication
(not the morphologically motivated type in §2.2) is motivated by constraints
such as Bat-El's (2006) COPY, which requires strings to have two occurrences
in stems, or Zuraw's (2002) REDUP, requiring word-internal similar substrings.
I adopt Zuraw’s REDUP.


REDUP — A word must contain some substrings that are coupled. 


Here, adjacent strings with identical vowels are assigned a reduplication-like
structure via metathesis. Recall the constraints presented in Table 6 in §4.2:
*CODA-r, MAX >> LINEARITY_LW, *Cx. The following tableau introduces the
REDUP constraint. I do not consider potential candidates which violate MAX,
and therefore ignore MAX (...) in the tableau. The grammar considers three
primary candidates, evaluating them with the relevant constraints:


/pʁopoʁtsija/    | REDUP | *CODA-r | LINEARITY_LW : *Cx
   pʁopoʁtsija   | *!    | *       |              : *
☞  pʁopʁotsija   |       |         | *            : **
✓  poʁpoʁtsija   |       | *!*     | *            :


Table 7 — Evaluating candidates for /psopostsija/. 


The grammar proposed in Table 7 simply does not work! It selects the incorrect
pʁopʁotsija (indicated by ☞) rather than the actual winner poʁpoʁtsija
(indicated by ✓). We expect the unattested /pʁopoʁtsija/ > pʁopʁotsija, which
satisfies REDUP and respects *CODA-r >> LINEARITY_LW (i.e. avoids codas by
metathesising the rhotic into onset position). We do not expect /pʁopoʁtsija/
> poʁpoʁtsija, as although this satisfies REDUP, it violates LINEARITY_LW as badly
as the unattested form, additionally incurring multiple violations of *CODA-r.
It appears that multiple violations of the low-ranked markedness constraint
barring complex margins are considerably worse than a single violation, and
to be avoided, even at the expense of a high-ranked constraint (in our case,
*CODA-r).
This is captured by the notion of constraint conjunction (Kirchner 1996;
Moreton & Smolensky 2002), particularly that of self-conjunction (Itô & Mester
1998). Self-conjunction encapsulates the idea that multiple violations of a single
constraint have a cumulative effect. While a single violation may be low-ranked
in the overall scheme of things, multiple violations are more highly ranked. This
is similar in effect to the notion of constraint weighting, whereby all constraints
have values, rather than relative rankings, and values are cumulative (Pater 2009;
Smolensky & Legendre 2006; Prince & Smolensky 1993/2004:236). I will
not argue for either model, though I adopt self-conjunction in my analysis. The
following tableau demonstrates the application of the self-conjoined *Cx+*Cx
(note, for simplicity's sake, MAX and LINEARITY_LW have been omitted (...) from
the table):


/pʁopoʁtsija/    | REDUP | *Cx+*Cx | *CODA-r | *Cx
   pʁopoʁtsija   | *!    |         | *       | *
   pʁopʁotsija   |       | *!      |         | **
☞  poʁpoʁtsija   |       |         | **      |


Table 8 — Evaluating candidates for /psopostsija/ with self-conjunction. 


The self-conjoined *Cx+*Cx outranks *CODA-r, thereby selecting the correct
output, poʁpoʁtsija. The low-ranked *Cx has a cumulative effect expressed by the
self-conjoined constraint. Multiple violations allow the effect of the markedness
constraint *Cx to surface.
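
The contrast between Tables 7 and 8 can be reproduced with a small computation. The sketch below rests on several assumptions that are not part of the source: the names `evaluate` and `add_self_conjunction` are hypothetical, the violation counts are those inferred from the discussion of the two tableaux, and the self-conjoined constraint is modelled simply as one violation whenever a candidate violates the base constraint at least twice within the word.

```python
def evaluate(profiles, ranking):
    """Strict-domination evaluation: filter candidates constraint by constraint."""
    survivors = list(profiles)
    for constraint in ranking:
        fewest = min(profiles[c].get(constraint, 0) for c in survivors)
        survivors = [c for c in survivors
                     if profiles[c].get(constraint, 0) == fewest]
    return survivors


def add_self_conjunction(profiles, base, name):
    """Return new profiles with a self-conjoined constraint added.

    The conjoined constraint (name) is violated once if the base
    constraint is violated two or more times, and not otherwise.
    """
    extended = {}
    for candidate, marks in profiles.items():
        new_marks = dict(marks)
        new_marks[name] = 1 if marks.get(base, 0) >= 2 else 0
        extended[candidate] = new_marks
    return extended


# Violation counts assumed from the discussion of Tables 7 and 8.
PROFILES = {
    "pʁopoʁtsija": {"REDUP": 1, "*CODA-r": 1, "*Cx": 1},      # faithful
    "pʁopʁotsija": {"LINEARITY": 1, "*Cx": 2},                # two complex onsets
    "poʁpoʁtsija": {"LINEARITY": 1, "*CODA-r": 2},            # two coda rhotics
}

# Ranking of Table 7 (without self-conjunction).
without_conjunction = evaluate(
    PROFILES, ["REDUP", "*CODA-r", "LINEARITY", "*Cx"])

# Ranking of Table 8: the self-conjoined constraint outranks *CODA-r.
with_conjunction = evaluate(
    add_self_conjunction(PROFILES, "*Cx", "*Cx+*Cx"),
    ["REDUP", "*Cx+*Cx", "*CODA-r", "LINEARITY", "*Cx"])

print(without_conjunction)
print(with_conjunction)
```

Without the conjoined constraint the evaluation returns the unattested pʁopʁotsija; once *Cx+*Cx is ranked above *CODA-r, the attested poʁpoʁtsija is selected. A weighted-constraint model, which the text mentions as similar in effect, is not sketched here.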

This is an additional instance of TETU (§1.1). Although Hebrew does allow
clusters in native words, it disfavours them, all things being equal. Specifically, 
clusters are barred in reduplicated elements, as this necessarily results in 
multiple violations of *Cx. Incidentally, cluster avoidance in Hebrew is not 
unique to reduplication but appears in other peripheral word formation such as 
acronyms (Zadok 2002), where clusters are avoided. Specifically with respect to 
reduplicated forms, crosslinguistically, reduplicants tend to be more unmarked 
than their bases, avoiding clusters when possible, even when these exist in the 
base. For example, the Klamath distributive formation, dje:mi → de-dje:m-a ‘be
hungry’ (Steriade 1988:131). 
5. Discussion


The unique behaviour of rhotics in Hebrew loanwords from English can be
explained within a formal, systematic grammar. Crucially, this grammar differs 
from the native grammar of Hebrew, i.e. there are parallel grammatical systems 
(core-periphery, lexical strata). This suggests that the grammar in certain 
(peripheral) areas of the lexicon (in our case, loanwords) may differ from the 
grammar in other strata in the lexicon. 

Since the grammar under discussion is not supported by Hebrew grammar, 
its constraints have to be universally motivated, supporting the notion that 
adults have access to UG (as they couldn't derive the grammar of rhotics from 
the ambient language). So it appears that *Copa-r, unmotivated by Hebrew 
grammar, must be part of UG. The question is whether this proposed constraint 
has any support. 

Lindau (1985) states that crosslinguistically, rhotics tend to vocalise and even 
delete in coda position (post-vocalically), implying that coda positions disfavour 
rhotics. Evidence for this is found in several languages, such as English, Dutch, 
German, Swedish and numerous other languages, where coda rhotics may be 
deleted or vocalised (e.g. Itô & Mester 2001 for German coda rhotics).

An additional means of avoiding coda rhotics is the metathesis of rhotics over 
vowels, which is well supported via articulatory coordination with vocalic nuclei 
(Hoole et al. 2013). In Rumanian (Savu 2013), grodino ‘garden’ is evidence 
of coda-nucleus metathesis, avoiding coda rhotics. In addition, in Rumanian 
loans from Slavic, syllabic rhotics are invariably adapted as complex onsets (e.g. 
Czech [brno] ‘Brno’ > Rumanian [brono]). In Ladino (Judeo-Spanish), there is
systematic rC → Cr metathesis in loanwords from other languages (e.g. Spanish
gordo > Ladino godro ‘fat’; Spanish verde ‘green’ > Ladino vedre). 

In the Sardinian dialect of Sestu Campidanian (Bolognesi 1998), coda rhotics 
are shifted into onset position when the root is preceded by a determiner (e.g. 
‘orku ‘ogre’ vs. s:rok:u ‘the ogre’). 

The behaviour of rhotics in Hebrew loanwords from English supports an 
approach by which adults have access to universal grammatical principles, 
which surface in the lexical periphery even when these are unsupported by 
native grammars (TETU). Universal Grammar may apply even in what appear 
to be stable grammatical systems, albeit in the lexical periphery of such systems. 


5 Note, Alber (2001) does not analyse this as metathesis from coda to onset position, as Bolognesi also
provides examples of onset rhotics being metathesised into stem initial position.




Acknowledgements 


I would like to thank the participants of the ’r-atics-3 conference in
Bozen-Bolzano for their input regarding many of the ideas expressed herein. Special
thanks to Outi Bat-El for her assistance in this research. The invaluable 
comments from two anonymous reviewers contributed considerably to the 
analyses in the paper. I take full responsibility for any shortcomings this paper 
may have. 


References 


Alber, Birgit. 2001. Maximizing first positions. In Anthony Dubach Green & 
Ruben van de Vijver (eds.), Proceedings of HILP 5, 1-19. Potsdam: University 
of Potsdam. 

Bat-El, Outi. 1994. Stem modification and cluster transfer in Modern Hebrew. 
Natural Language and Linguistic Theory 12. 571-596. 

Bat-El, Outi. 2006. Consonant identity and consonant copy: the segmental and 
prosodic structure of Hebrew reduplication. Linguistic Inquiry 37. 179-210. 

Berent, Iris, Tracy Lennertz, Paul Smolensky & Vered Vaknin-Nusbaum. 
2009. Listeners' knowledge of phonological universals: evidence from nasal 
clusters. Phonology 26. 75-108. 

Bolognesi, Roberto. 1998. The phonology of Campidanian Sardinian. A unitary 
account of a self-organising structure. PhD thesis, University of Amsterdam. 

Bolozky, Shmuel. 1978. Word formation strategies in the Hebrew verb system: 
denominative verbs. Afroasiatic Linguistics 5. 1-26. 

Bolozky, Shmuel & Rina Kreitman. 2007. Uvulars in Israeli Hebrew — their 
phonetic and phonological status. The National Association of Professors 
of Hebrew's International Conference on Hebrew Language, Literature and 
Culture. Sydney, Australia July 2-4. 

Cohen, Evan-Gary. 2010. The role of similarity in phonology: evidence from 
loanword adaptation in Hebrew. PhD thesis, Tel-Aviv University. 

Cohen, Evan-Gary. 2011. The emergence of UG in the periphery: vowel 
harmony in Hebrew loanwords. Proceedings of the 26th meeting of the Israeli 
Association for Theoretical Linguistics. http://linguistics.huji.ac.il/IATL/26/Cohen.pdf/. 

Escudero, Paola, Rachel Hayes-Harb & Holger Mitterer. 2008. Novel second- 
language words and asymmetric lexical access. Journal of Phonetics 36. 
345-360. 




Gafos, Diamandis. 1998. Eliminating long-distance consonantal spreading. 
Natural Language and Linguistic Theory 16. 223-278. 

Hall, Nancy. 2009. Long distance r-dissimilation in American English. 
Unpublished manuscript, California State University, Long Beach. 
http://www.csulb.edu/~nhall2/dissimilation_paper.pdf. 

Hoole, Philip, Marianne Pouplier, Stefan Benus & Lasse Bombien. 2013. 
Articulatory coordination in obstruent-sonorant clusters and syllabic 
consonants: data and modelling. This volume. 

Itô, Junko & Armin Mester. 1998. Markedness and word structure: OCP effects 
in Japanese. Manuscript. University of California, Santa Cruz. 

Itô, Junko & Armin Mester. 1999. The structure of the phonological lexicon. 
In Natsuko Tsujimura (ed.), The Handbook of Japanese Linguistics, 62-100. 
Cambridge: Blackwell. 

Itô, Junko & Armin Mester. 2001. Structure preservation and stratal opacity 
in German. In Linda Lombardi (ed.), Segmental Phonology in Optimality 
Theory, 261-295. Cambridge: Cambridge University Press. 

Kelly, John & John Local. 1986. Long-domain resonance patterns in English. 
International Conference on Speech Input/Output: techniques and applications. 
London: Institute of Electrical Engineers, 304-309. 

Kenstowicz, Michael. 2003. The role of perception in loanword phonology. 
Studies in African Linguistics 32. 95-112. 

Kirchner, Robert. 1996. Synchronic chain shifts in Optimality Theory. Linguistic 
Inquiry 27. 341-350. 

Ladefoged, Peter & Ian Maddieson. 1996. The sounds of the world's languages. 
Oxford: Blackwell. 

Lindau, Mona. 1985. The story of /r/. In Victoria Fromkin (ed.), Phonetic 
linguistics: Essays in honor of Peter Ladefoged, 157-168. Orlando: Academic 
Press. 

Magnuson, Thomas. 2007. The story of /r/ in two vocal tracts. In Jürgen Trouvain 
& William Barry (eds.), Proceedings of the 16th International Congress of 
Phonetic Sciences, 1193-1196. 

McCarthy, John & Alan Prince. 1994. The emergence of the unmarked: 
optimality in prosodic morphology. In Mercé Gonzàlez (ed.), Proceedings of 
the Northeast Linguistics Society 24, 333-379. Amherst: Graduate Linguistic 
Student Association. 

McCarthy, John & Alan Prince. 1995. Faithfulness and reduplicative identity. 
In Jill Beckman, Laura Walsh & Suzanne Urbanczyk (eds.), University of 
Massachusetts occasional papers in linguistics 18: papers in Optimality Theory, 
249-382. Amherst: Graduate Linguistic Student Association. 




Moreton, Elliott & Paul Smolensky. 2002. Typological consequences of local 
constraint conjunction. In Line Mikkelsen & Christopher Potts (eds.), 
Proceedings of the 21st West Coast Conference on Formal Linguistics, 306-319. 
Cambridge: Cascadilla. 

Paradis, Carole & Darlene LaCharité. 1997. Preservation and minimality in 
loanword adaptation. Journal of Linguistics 33. 379-430. 

Pater, Joe. 2009. Weighted constraints in generative linguistics. Cognitive Science 
33. 999-1035. 

Prince, Alan & Paul Smolensky. 1993/2004. Optimality Theory: Constraint 
interaction in generative grammar. Oxford: Basil Blackwell. [Also TR 2, 
Rutgers University Cognitive Science Center]. 

Savu, Carmen. 2013. Another look at the structure of [r]: constricted intervals 
and vocalic elements. This volume. 

Schwarzwald, Ora. 2002. Studies in Hebrew morphology. Volume III. Tel-Aviv: 
Open University (in Hebrew). 

Shinohara, Shigeko. 2004. Emergence of Universal Grammar in foreign word 
adaptations. In Rene Kager, Joe Pater & Wim Zonneveld (eds.), Constraints 
in phonological acquisition, 292-320. Cambridge: Cambridge University 
Press. 

Smith, Jennifer. 2005. Loan phonology is not all perception: Evidence from 
Japanese loan doublets. In Timothy J. Vance & Kimberly A. Jones (eds.), 
Japanese/Korean Linguistics 14, 63-74. Stanford: CSLI. 

Smolensky, Paul & Géraldine Legendre. 2006. The harmonic mind: From neural 
computation to Optimality-Theoretic grammar. Cambridge, MA: MIT Press. 

Steriade, Donca. 1982. Greek prosodies and the nature of syllabification. PhD 
thesis, Massachusetts Institute of Technology. New York: Garland Press. 

Steriade, Donca. 1988. Reduplication and syllable transfer in Sanskrit and 
elsewhere. Phonology 5. 73-155. 

Steriade, Donca. 2001/2008. The phonology of perceptibility effects: The 
P-map and its consequences for constraint organisation. In Kristin Hanson 
& Sharon Inkelas (eds.), The nature of the word, 151-179. Cambridge: MIT 
Press. 

Ussishkin, Adam. 2000. The emergence of fixed prosody. PhD thesis, UCSC. 

Vendelin, Inga & Sharon Peperkamp. 2006. The influence of orthography on 
loanword adaptations. Lingua 116. 996-1007. 

Zadok, Gila. 2002. Abbreviations: A Unified Analysis of Acronym Words, 
Clippings, Clipped Compounds, and Hypocoristics. M.A. thesis, Tel-Aviv 
University. 

Zuraw, Kie. 2002. Aggressive reduplication. Phonology 19. 395-439. 




Part III 


Language variation 
and change 


A preliminary contribution to the study of phonetic variation of /r/ in Italian and Italo-Romance 


Antonio Romano, Università degli Studi di Torino 


Abstract 

This paper aims to offer a first contribution to the phonetic description of the 
different realisations of /r/ in the present-day Italo-Romance languages spoken in 
Italy. It discusses a selection of phonetic phenomena observed in current use from a 
descriptive point of view and which have been confirmed in most cases by experimental 
evidence. 

Descriptions are based on a sample of a thousand r-realisations from different speakers 
(of different origins and with idiosyncratic phonetic properties) and are offered in terms 
of ‘narrow phonetics’. 


1. Introduction¹ 


In spite of the fact that the main sources of variability described in the Italian 
domain for these sounds are stylistic and — to a lesser extent — diatopic, very few 
details on them are generally given in sociolinguistic or dialectological studies. 
Behind the traditional dichotomy between an apical r vs. a uvular r (sometimes 
masked by general labels, which were used to describe quite different classes 
of sounds following the authors’ impressions) stands a typological vagueness 
which characterises not only large diffusion books, but also part of the scientific 
literature. 

A symptom of the different considerations connected with r-pronunciation is 
the disagreement on the sociolinguistic status accorded to some r-sounds in 
phonetic studies. Even when the authors agree on their articulatory description, 
different opinions on the prestige status of dialectal variants clearly reveal the 
incomplete (and, more often, non-uniform) knowledge of the geographical and 
social variability of these units within the Italian diasystem. While a large 
number of interesting phenomena involving the realisations of /r/ and /rr/ are 
passed over, significant emphasis is given to what is usually called erre moscia 
(‘limp’ or ‘lifeless r’), which, as some phoneticians have underlined (see 
Canepari 1979 and Mioni 1986), is a simple way to class together, in the same 
stereotyped category, more than ten phonetically different basic articulations. 
For all these r-sounds, besides checking the supposed presence of uvular 
vibrations, we need a better articulatory description, including details about the 
place and the degree of constriction, the dynamics of vibrations (when really 
present), voicing properties, and so on. 

1 This paper reproduces some of the contents of the paper presented at ’r-atics-2: 2nd International
Workshop on the Sociolinguistic, Phonetic and Phonological Characteristics of /r/ (Université Libre de
Bruxelles, 5-7 Dec. 2002). 

This paper summarises the preliminary descriptive work I prepared in view 
of research I carried out in this domain from 1998 to 2003 and whose results 
showed that, besides a dependence on general patterns of temporal organisation, 
speakers have recourse to different strategies to obtain non-apical r-sounds by 
using the acoustic (and perceptual) effects of rapid changes in frequency patterns. 


2. Rhotics' variability: functional principles, articulatory 
strategies and acoustic cues 


Rhotics are a broad class of speech sounds whose articulatory and acoustic 
properties are known to be particularly speaker- and language-dependent 
(Stevens 1989:40). They are basically associated with apical trills, usually 
described as the central members of this class, but an enormous variety of other 
sounds can be found in various languages and dialects. 

While phonetic modelling reveals that an efficient tongue-tip vibration depends 
on airflow, impedance, and appropriate apical control (McGowan 1992:2903; 
Widdison 1997:191), apical trills are also described as articulatory gestures with 
narrower aerodynamic requirements than other sounds (Recasens 1991; Solé 
1999). That could be a valid reason explaining why they usually undergo all 
kinds of variation and why they are interesting for sociolinguistics (Labov 1972; 
various papers in Van de Velde & Van Hout 2001). 

In the literature, trills are described as extremely fine articulations: 


"Learning to make a trill involves placing the tongue, very loosely, in exactly the right position 
so that it will be set in vibration by a current of air. [...] The problem experienced by most 
people who fail to make trills is that the blade of the tongue is too stiff" (Ladefoged 1993:169). 

In the past decades, Barry (1997) and Catford (2001) reopened the classical 
debate on the historical evolution of r-sounds in different languages. 
Well-known case studies have been traditionally represented by French and 
German, whereas nowadays many other languages, including Italian, show 
interesting social dynamics involving r-sounds. 

For standard Italian, the phonological starting assumptions are that an apico- 
alveolar phoneme /r/ contrasts in intervocalic position, following the consonant 
gemination pattern generalised in the whole system, with a geminate counterpart 
/rr/, whereas in other languages such as RP English or ‘normative’ French 
only one rhotic phoneme is synchronically acknowledged, with realisations 
respectively described as an alveolar approximant and a uvular fricative or 
approximant (or even a trill in some varieties; Demolin 2001)². A functional 
view allows us to assume that the sounds that realise the two phonological units 
in Italian are therefore, in both cases, apical trills with a different number of 
contacts³. 

Nevertheless, trills are not just series of taps: they are quite different from taps 
in that the body of the tongue is subject to a higher degree of constraint during 
the production of a trill than of a tap (Recasens 1991; Kavitskaya 1997). 

As discussed in the present paper, in a number of Italian idiolects single rhotics 
are not trilled (a distributional analysis of /r/ and /rr/ allophones is in Canepari 
1979, 1999; see §3). Acoustic cues associated with the articulatory properties 
of these allophones have been extensively analysed for the different languages 
where they are mainly attested (e.g. Meyer-Eppler 1959; Delattre 1966, 1971; 
Ladefoged et al. 1977; Hagiwara 1995; Schiller & Mooshammer 1995; Alwan 
et al. 1997). 

2 The actual phonetic implementation of French rhotics is often disregarded in favour of a supposed diffusion 
of uvular trills. Ladefoged & Maddieson (1996:225) observe that "Uvular trills occur in some conservative 
varieties of Standard French and Standard German, although most speakers of these languages use uvular 
fricatives or approximants rather than trills". The results of research (partially published in Billiez et al. 2002) 
which I presented at ’r-atics-2 indicated a fricative/approximant as the most common realisation of 
French /ʁ/. 

3 As a general reference, see Ladefoged & Maddieson (1996:218-219): "[i]n Italian, single and geminate 
forms of most consonants contrast in intervocalic position" and "[t]he single/geminate opposition also 
applies to trills". In repetitions of the words caro and carro by five speakers of Standard Italian, they found 
none of the intervocalic single trills to have more than two contacts while the geminate trills showed no 
fewer than three contacts and up to seven (Ladefoged & Maddieson 1996:221). 

In Romance languages, a distinction is usually made between ‘polyvibrants’ and ‘monovibrants’ without 
further defining different articulatory possibilities within these classes (this choice traditionally matches the 
two-way perceptual distinction proposed in Barry 1997:40, accounting for single-strike vs. multi-strike 
r-sounds). In Canepari (1999:101) we may find a finer classification for monovibrant r-sounds in Italian, 
where they are divided into two categories: vibrati (taps) and vibratili (flaps). Previously Mioni 
(1986:45) defined taps as battiti ‘beats’, and flaps as scatti ‘triggers’. Even though taps and flaps are not 
generally distinguished elsewhere in the literature (see Barry 1997:38), a clear distinction is proposed by 
Ladefoged & Maddieson (1996:232). 

With regard to their description in terms of timing, vibration frequencies and 
dynamic properties (as they appear on acoustic inspection), one can refer to 
Ladefoged & Maddieson (1996:218), who observe closed and open phases in the 
order of 25 ms each⁴. 
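
The timing figures above can be related to a vibration rate by simple arithmetic. The following is only a minimal sketch, not part of the original study: the function names `trill_rate_hz` and `cycle_ms_for_rate` are hypothetical, and the sketch assumes that one vibration cycle is simply one closed phase plus one open phase.

```python
def trill_rate_hz(closed_ms: float, open_ms: float) -> float:
    """Vibration rate (Hz) of a trill whose cycle is assumed to be
    one closed phase plus one open phase, both in milliseconds."""
    cycle_ms = closed_ms + open_ms
    if cycle_ms <= 0:
        raise ValueError("phase durations must sum to a positive value")
    return 1000.0 / cycle_ms


def cycle_ms_for_rate(rate_hz: float) -> float:
    """Duration (ms) of one full vibration cycle at a given rate in Hz."""
    if rate_hz <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / rate_hz


# Closed and open phases "in the order of 25 ms each" (illustrative values).
rate = trill_rate_hz(25.0, 25.0)
print(f"25 ms + 25 ms phases -> about {rate:.0f} Hz")

# Cycle lengths implied by the vibration rates cited in the accompanying note.
for hz in (26.0, 30.0):
    print(f"{hz:.0f} Hz -> cycle of about {cycle_ms_for_rate(hz):.1f} ms")
```

Under this simplification, phases of about 25 ms each give a cycle of about 50 ms, whereas the 26-30 Hz rates mentioned in the accompanying note on temporal characteristics would correspond to cycles of roughly 33-38 ms; the "25 ms each" figure is therefore probably best read as an order of magnitude.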

Likewise, a single vibration in intervocalic position is traditionally 
described, in most varieties of Spanish and Italian, as having a mean closure 
duration of about 20-25 ms (Vagges et al. 1975; Quilis 1981; Contini 1983; 
Recasens 1991)⁵. 

In their survey of Florentine students, Vagges et al. (1978:3) showed that 
7 speakers out of 10 realised /r/ as a monovibrant, 2 as a flap and 1 as a 
“multivibrant”⁶. 

Concerning spectral distinctions between uvular and apical r-sounds, an 
interesting framework is provided for a number of languages in some traditional 
acoustic approaches (see for instance Jakobson 1957; Fant 1960, 1968; Delattre 
1966, 1970)⁷. 


3. Italian r-sounds 


Concerning the actual status of Italian r-sounds, the literature is relatively poor. 
As partially introduced in §1, the main interest is devoted to r-variability in 
some geographical varieties and to the diffusion of defective variants known 
as r moscia, to which, as far as I know, no instrumental study has been explicitly 
dedicated. 


4 Temporal characteristics of trills are detailed in reference to studies surveyed in Ladefoged & Maddieson 
(1996:226). Measures for the mean vibration rate for trills are in the range 26-30 Hz. Though anatomically 
very different, bilabial, apical and uvular trills vibrate at similar frequencies. Ladefoged et al. (1977) 
proposed an explanation based on the compensation of the difference between the masses involved by a 
decrease in the articulators' tenseness. 

5 In the examples given by Ladefoged & Maddieson (1996:231) for two tap realisations, the spectrograms show 
durations shorter than the mean duration I found in Italian single rs by about 20 ms and 25 ms (see §3). 


6 A mean duration and standard deviation of 25 ± 18 ms are reported for intervocalic r in repetitions of one 
word. Similar values are reported by Contini (1983) in his acoustic analysis of Sardinian "constrictives à 
battements" whose realisations are single-strike, with a typical duration of 20-30 ms, or multi-strike, with 
2-5 interruptions of similar durations and an interrupted spectrum similar to that of a central vowel 
(Contini 1983:414-415). In Vietti et al. (2009) 138 single-r postvocalic realisations are measured for 
speakers of 16 Italian cities in laboratory productions: a single-strike apical trill appeared in 38% of cases (a tap 
perhaps only in 6% of cases), with durations in the expected range (25 ± 6 ms). Another 5% are ‘smoothed’ 
taps, whereas 18.3% are ‘broken’ approximants and 6.7% are regular approximants; 7% were 
single-strike apical trills of longer duration (31 ± 6 ms) with acoustic characteristics similar to those of a 
voiced stop. Velar, uvular and pharyngeal realisations (usually uvularised or pharyngealised alveolar taps) 
rank, mainly for northern speakers, up to 10%; a (somewhat lateralised) flap is dominant for Venice 
speakers (6%), while vowel rhotacism and r-deletion are limited to a residual 5% of cases. 


7 See details in Ladefoged & Maddieson (1996:226-231). In Romano (forthcoming), I support a general 
view of a vowel-colouring of r-sounds ([a]-like when apical and [o]-like for back articulations) and suggest 
possible spectral dynamics for single-r variants. 




A simplified description of the Italian phonological system basically assumes a 
phoneme /r/ and its geminate counterpart /rr/ whose phonetic realisations, as 
already discussed in §1, are both apical trills with a different number of contacts. 
In Canepari's traditional finer analysis, summarised in Canepari (1999:97-102), 
the phoneme /r/ is associated with both [r] and [ɾ] (the latter mainly occurring in 
unstressed syllables). A detailed distributional analysis is given in the following 
passage: 


"[N]ella pronuncia neutra odierna effettiva abbiamo, normalmente [r] in sillaba 
accentata: [(C/V)'rV-,'CrV-, 'Vr:C(V), V()rt] (oppure, solo come variante occasionale, 
non sistematica, e non enfatica, [ɾ]). Mentre negli altri casi si ha [ɾ]: ['VierV, (V/C)(,) 
1V-, Ve-, -£()C-] (oppure come variante possibile, specie per enfasi, [r]). Per /rr/ si ha: 
[ Vrz V, Ve'rV, (,)VerV, Vel eV] (oppure anche [r:r, rr], soprattutto per enfasi)” (Canepari 
1999:97-98)⁸. 


Using an instrumental approach, I checked the examples proposed by Canepari 
(1999:328) (raro ['ra:ro] < /'raro/ ‘rare’, parlare [par'la:re] < /par'lare/ ‘to speak’, 
Mario ['ma:rjo] < /'ma:rjo/, carro ['kar:ro] < /'karro/ ‘cart’, Enrico [en'ri:ko] < 
/en'riko/) which presented various phonotactic solutions. The speech sample 
came from the tape associated with Canepari's handbook and the speaker 
was a professional male speaker with no particularly evident regional traits. 
Waveforms and spectrograms are displayed in Fig. 1 with the help of WASP 
(1.02). 


8 An updated source is now provided by Canepari (2005). 

Figure 1. Waveforms and spectrograms (obtained with the program WASP, thanks to M. 
Huckvale, UCL) showing standard Italian pronunciation of /r/ in the following words (spoken by a 
professional male speaker, see Canepari 1999): (upper row) raro ‘rare’, parlare ‘to speak’, Mario ‘(person's 
name)’; (lower row) carro ‘cart’, Enrico ‘(person's name)’. 


Taps appear only in the positions allowed by a phonetic reduction rule. Their 
realisation is restricted to the intervocalic unstressed position or to the ‘explosive’ 
phase of /rr/⁹. Nevertheless, they may have a closing phase longer than 50 ms, 
which is slightly (and suspiciously) higher than the one usually measured for 
taps in other languages (see §2). Other /r/ realisations (such as the coda /r/ in 
the first unstressed syllable of parlare and the onset of the stressed syllable of 
raro and Enrico) are not single-strike sounds and are realised with a two- or 
three-strike trill, as against the longer 5-strike trill realising /rr/. 


9 According to Rousselot (1913:53): “L’r double se comporte donc comme les autres consonnes redoublées 
qui, doubles pour l'oreille, ne sont, au point de vue articulatoire, que des consonnes simples fortes et 
longues”, but these sounds lead to a phonetic distinction: "La 1re r entendue est une r implosive ; la 2e, une r 
explosive" (ibid.). In agreement with Canepari's distributional scheme, as shown further on in this 
work, this assumption for Italian does not conflict in principle with Inouye's (1995) generalisation of a 
phonetic length feature governing the relationship between trills and taps, which remains valid for languages 
without geminate/singleton contrasts. 


10 As shown by some extensive studies (Vagges et al. 1978; Ferrero et al. 1979), Italian apical r's are resistant 
to coarticulatory effects of neighboring sounds (for apical trills in general, see Lindau 1985). 

Regional varieties of Italian follow the same distribution, with intervocalic single 
rhotics realised as single-strike sounds¹¹. On the basis of a number of items 
I analysed in spontaneous dialogues for different varieties (variously disposed 
to tap spreading in other phonotactic contexts, see Romano, forthcoming), I 
observed that single-strike r-sounds tend to preserve a higher tension in the 
vowel-to-consonant transition than that usually reported for languages 
described as tap languages (cp. Vietti et al. 2009). 


4. (Not only) Back r-sounds in Italian 


While the term grasseyé is nowadays used in French to refer to a variety of non-apical 
sounds, the general category for Italian r-sounds differing from the vibrant sounds 
described as standard is traditionally labelled r moscia ‘limp or lifeless r’ (see §1). 
As shown by some phoneticians (see Canepari 1979; Mioni 1986), limp or 
lifeless r-sounds are in reality quite different articulations which have been 
gathered in order to denote defective or snobbish pronunciations. 

People using a different kind of r are euphemistically said to have a French r 
(r alla francese or, simply, r francese) even when these sounds have nothing to 
do with the French r-pronunciation. Other common expressions to indicate a 
burrer are just ha la erre ‘(s)he has the r’ or, in some cases, non ha la erre ‘(s)he 
does not have the r’. In other cases the r-pronunciation is not lifeless at all (e.g. 
the case of long uvular trills) but the label r moscia is extended to them by some 
informants. On the other hand, I encountered the term r pizzicata ‘pinched 
r’, which is also used in some regions with regard to this sustained but nonetheless 
‘different’ pronunciation. 

In the literature there is disagreement on the sociolinguistic status of such 
r-sounds because different opinions are expressed on the prestige status of 
idiolects which contain these sounds. This reveals an incomplete (non-uniform) 
knowledge of the geographical and social variability of this phenomenon in Italy. 
As is also remarked in Ladefoged & Maddieson (1996:226), Ladefoged et al. 
(1977) described a uvular trill appearing in Italian in a prestige dialect (but, 
since there is no clear-cut social differentiation for these sounds, idiolects must 
have been considered)¹². 


11 Inouye (1995) demonstrated that intervocalic tapping of trills is widespread crosslinguistically (in this case 
only as realisations of a single consonant). 


12 Traditional dialects described as having uvular r-sounds are in northern areas (almost exclusively 
north-western dialects or in the bilingual areas in the North-East, on the boundary with German-speaking 
countries) but there is no particular reason to consider them as prestige dialects. Indeed, individual burrers 
have been identified everywhere in villages from North to South where specific burring styles are 
widespread and are sometimes promoted as markers of local socio-geographical identity. 

Chambers & Trudgill (1998:191) write about a “uvular /r/ only in some 
educated speech”, but even that description does not reflect the real Italian 
situation, where the usage of this kind of r-sound is still considered (as it was for 
French in the past centuries, see above) a pronunciation defect or, in some cases, 
a symptom of snobbery and affectation, more than ‘education’. 

In most cases the sounds labelled as r mosce are even considered as ‘pathological’. 
A similar position is expressed by Widdison (1997:189), who includes Italian 
back r-sounds among the cases of "deviation from the norm" (and this applies 
not only to northern Italian). 

Canepari (1999) includes them, among pronunciation defects, in a detailed 
articulatory classification (sometimes making use of finer non-IPA phonetic 
notations): 


“[C]’è una certa varietà d'«erre mosce» usate in italiano per caratteristiche individuali. Ci 
sono quattro tipi uvulari sonori, rispettivamente: vibrante [...], costrittivo [...], approssimante 
[...] e vibrato [...]. [R] è il tipo normale in lingue come il francese belga, [&] in tedesco, [a] in 
francese; [n] è un suono più debole, che può ricorrere come variante occasionale. [...] Altrove, 
comunque, possono essere più o meno diffuse in tutte le regioni [...]. Un altro tipo piuttosto 
frequente d'« erre moscia » è l'approssimante sonoro labiodentale [ʋ] [...] che, nella variante 
uvularizzata [e], suona rivoltantemente snobistico in italiano" (Canepari 1999:98)¹³. 


In fact, rather than being prestige variants, different types of r mosce appear 
everywhere, even in rural areas and in lower socio-economic conditions, and are 
often considered to be a pronunciation defect. Barry (1997) remarks that the 
apical r-pronunciation is simply something that a number of speakers in any 
country just cannot produce: 


"In Italy and Spain, and Bulgaria, where trilled and/or flapped lingual «Rs» are de rigueur, 
efforts are made at primary school level to help children with problems. A good proportion 
do indeed achieve the goal, but there are always «pathological» cases which have to resort 
to e.g. a «labial R»" (Barry 1997:38). 


13 Referring to r moscia, the author gives very useful phonetic details when he observes that these sounds "in 
italiano di solito si accompagnano anche a una struttura sillabica caudata più «strascicata» /'VC/ ['V:C] 
(invece di ['VC:])". Furthermore, a better account of the conditions in which these pronunciations appear is in 
the following passage: “Non raramente alcuni tipi d'« erre moscia » sono usati volontariamente come degli 
xenofonemi stilistici, parlando in italiano, anche se spesso i risultati sono ridicoli e insopportabili. Di solito, 
l'erre moscia dà un'impressione d'affettazione" (Canepari 1999:99-100). 

Some concessions are made by Mioni (1986) who gives a reduced list of possible 
r-variants and writes: 


“Tutti questi foni sono possibili sostituti di /r/ in patologia anche se l'uvulare [x] è così 
ampiamente diffusa tra gli italiani, che ci si può domandare se debba ancora essere 
considerata come deviante" (Mioni 1986:46, n. 27). 


A more tolerant opinion is expressed by Canepari (1999): 


“[I]n alcune zone d'Italia la realizzazione più diffusa per /r/ è uvulare [ws r], che localmente 
può essere considerata quasi il tipo « normale », mentre l'articolazione alveolare diviene 
minoritaria; si tratta dell'Alto Adige, della Val d'Aosta e di buona parte della provincia di 
Parma” (Canepari 1999:101)¹⁴. 


However, if I were to give an estimate of the proportion of r moscia pronunciations 
in (mainly urban) northern Italy, I would say that fewer than 
10% of speakers systematically resort to this kind of (varied) pronunciation 
(perhaps more than 10% only in Piedmont and in the Parma province)¹⁵. 

As for the Italian back r-sounds, the origin of the irregular presence of these 
pronunciation styles is rarely investigated (Migliorini 1992:485 reports a source 
of the 17th c. referring to a French-style imitation). 

High society French models have traditionally been described as the origin of the 
diffusion of back r-sounds in various central and northern European languages 
(see, among others, Chambers & Trudgill 1998), but several authors quoted in 
Van de Velde & van Hout (2001), Van de Velde et al. (2013) and Sankoff & 
Blondeau (2013) claimed an older and independent origin for different areas 
(e.g. Holland and the Rhineland). The theory of the French back-r spread could 
be valid for some Italian areas but other hypotheses cannot be excluded¹⁶. 


14 For a ‘normal’ diffusion of uvular r-sounds in the area of Parma see Canepari (1999:387; see also a few 
comments on p. 381 about a possible diffusion in northern Lombard provinces, cp. Rohlfs 1966:377). A 
socio-phonetic survey of r-sounds in the Parma province is now presented in the first section of Felloni 
(2011). 


15 By contrast, I would estimate that French back-r pronunciation stands everywhere above 90%. 
This should give an idea of the difference between the two situations. 


16 Fundamental contributions have been made by Bonnard (1982), who collected elements to show that the 
back r is a creation of a high socio-economic class and dates back to a period between the 15th and 17th c. 
The change took place as a consequence of the raising of the tongue dorsum towards the velum (with or 
without flapping of the uvula). This kind of explanation is adopted in Delattre (1966:207). The French r 
shift is interpreted by this author as the consequence of a language-dependent articulatory constraint. 
Carton (1974:164) seems to go in the same direction, accounting for an effect of "vocalic anticipation", but 
concludes in favour of a social explanation. Nevertheless, the same stands for Italian (or Spanish) where the 
trill is even considered articulatorily complex (Francescato 1970:75-76), and is often replaced by /l/ or 
uvular sounds by some children at the first stages, but nothing stops the acquisition of the apical trill, which 
progressively asserts itself among the various allophones. 

In the evolution of the Italian language and of Romance dialects spoken in Italy, 
a significant number of different phenomena, related to sound changes and 
derivational processes, involved rhotics. Besides the alternations inherited from 
Latin, and general properties related to liquids in Romance dialects, various 
outcomes are usually described (see Romano 2008 for details). 

In present-day Italian, according to Canepari (1999:101-102), one should take 
into account at least the following r-variants as typical realisations in some 
regions, even though some speakers may have recourse to other choices. 

A single-strike articulation is widespread in northern areas in almost all the 
contexts (even as a /rr/ realisation in conservative accents) but, in association 
with velar, uvular or pharyngeal realisations described above, Piedmont, Aosta 
Valley and part of Emilia-Romagna and Lombardy have an apical trill usually 
uvularised [£] [...] whereas in Liguria an alveolar uvularised tap [£] seems more 
frequent (see §2). 

Among the most interesting regional r-sounds there are north-eastern alveolar 
approximants and taps which are generally lateralised (and therefore they really 
sound like liquid-r's). In Venice the most common r-realisation is a postalveolar 
(somewhat retroflex) flap tending to show lateralisation (see above; cp. with 
retroflex flaps studied in Kvale & Foldvik 1995). These sounds realise /r/ in 
almost all the positions, often violating the general scheme illustrated in §3.¹⁷ 
Slightly different varieties of these sounds can be heard in coastal areas of 
Tuscany (on the Tyrrhenian coast; see Romano, forthcoming). 

In particular, I would like to emphasise that these r-variants are rarely perceived 
as marked and are usually attributed to a regional ‘accent’. These sounds could 
be described as a kind of more retracted retroflex approximant (something like a 
[1]) and occur as a realisation of /r/ in internal coda position or as the implosive 
phase of /rr/. They are particularly evident in stressed syllables in casual speech.¹⁸ 
In Sicilian and southern Calabrian, word-initial r's traditionally undergo a 


17 I shall transcribe these sounds with [r], [1] and [t] respectively. Canepari's definitions are often more
fine-grained and need additional special symbols (Canepari 1999:101, 401). As far as I have been able to
observe, the voiced alveolar approximant (not lateralised) described by Canepari (1999:102) as common in
Apulia is attested with some limitations around Bari and in speakers of Albanian origins (on the contrary,
the voiced alveolar fricative tap introduced to account for the Italian r pronunciation in northern Calabrian
may have a wider extension in southern Italy). Other places where liquid-r's are de rigueur, as already
introduced, are south-western Piedmont (with the [r], usual around Frabosa, and [1], between Pamparato and a
wider area in the Asti province, which determine varieties of those r-sounds known as r monferrina, see
Cabiale 1970, and Ghia 2010). Similar sounds are typical for some conservative patois speakers from
Salbertrand (in the Turin province) and other Alpine areas on the border (Briga Alta). Western varieties in the
same valleys are renowned for using a different r-sound known as dental r (or, more locally, valsusina r)
whose realisations oscillate between [6] and [z]. 


18 In his description of the dialect of Rossano (province of Massa), Rossi (1974:413) defines a postalveolar [r], 
but [i]-like vocalic components are highlighted in some r-transcriptions given by Rohlfs (1966) for Pisan 
and Ligurian varieties (see Giannelli, 1983; Pacini, 2004). A critical overview of palatalised rhotics is 
offered by Hamann (2002). 
lengthening process - initial long trills are frequently realised as cacuminal (or 
retroflex) fricatives. Most of these pronunciations are also common in the speech 
of conservative speakers when they speak their regional Italian.¹⁹ Moreover, in 
the same regions, tr- and dr- are subject to affrication, yielding postalveolar 
stops or affricates (e.g. Sic. frenu vs. It. treno).²⁰ 

Apical trill devoicing is also widespread in non-standard central and southern 
Italian pronunciations and is usually disregarded in the specific literature 
(examples are collected by Canepari 1999:440, 445, 447).²¹ 


5. Other (pretended) back r-sounds 


In spite of the common idea that r moscia is a uvular r, the most common 
defective r-sounds are labiodental approximants [ʋ] (often velarised [e ]).²² 
Similarly, pretended French r’s in Italian speakers are nowadays uncommon in 
French. 

Northern Italian speakers using a back r do not all have recourse to the same 
kind of articulation, but use significantly different varieties. Here is a simplified 
list of the most common possibilities (also possible everywhere in Italy): 


19 According to Canepari (1999:102), in these regions (plus Sardinia), word-initial /*r/ is replaced by /rr/. In 
Sicily and southern Calabria, this is then realised, in the more conservative accents, as a voiced alveolar or 
postalveolar fricative/approximant sometimes transcribed as [27] which is obviously neither [z] nor [3] (nor 
their weakened counterparts). Lacking fundamental information on tongue sulcalisation, I usually simplify 
the transcription of these sounds, assuming postalveolar (retroflex) fricatives and approximants as basic 
sounds (for a review on retroflexion see Bhat 1974). In unpublished research carried out in 2007 I made 
several measurements on realisations of this type collected by Vito Matranga within the archive of 
audio-recordings available in the ALS. These approximants, fricatives and affricates show different degrees of 
fronting or cacuminalisation (see Matranga 2007). 


20 Note that the tr- cluster after s- undergoes anticipatory assimilation too (-str- > -ss- > -{-). The general 
phenomenon (also attested for Sallentinian varieties, see Romano 1999) is well-described in Italian phonetic 
literature (since Millardet 1933) and a number of articulatory possibilities are specified for Calabrian 
dialects by Romito & Belluscio (1996), Sorianello & Mancuso (1998) and others (see Romano & 
Gambino 2010). 


21 The devoicing process is mainly attested in coda position before voiceless consonants where speakers of 
these varieties hyperarticulate r-sounds with an increase in the tension of the constriction (and slight 
retraction of the articulation place) by producing [r] and/or [f]. 
22 E.g. some Piedmontese speakers presenting the labiodental approximant [ʋ], when not suppressing the sound, 
tend to articulate the clusters /pr-/ and /br-/, in particularly prominent positions, respectively as [B®] and [B®] 
(maybe only single-strike). That seldom happens even for Piedmontese speakers with uvular trills (similar 
sounds mark the pronunciation adopted for the Italian voice of the Warner Bros' cartoon character Roger Rabbit 
who utters [n|/[s] in the realisation of initial pr-/br- clusters). Another example is the stereotype given by 
the actor Totò for the Neapolitan snobbish r moscia which is realised as a dental approximant (something like 
[0] or [z], see footnotes above). Finally, I shall mention here the example of a professional speaker of the 
regional Piedmontese TV News of the National Broadcaster RAI, who frequently lets the tip of his tongue come 
out from the mouth while speaking (occasionally showing linguo-labial contacts). This phenomenon systematically 
appears during the production of the clusters -rt-, -rd-, -rl- and -rn-, all normally including apico-alveolar 
contacts, replaced by predorso-alveolar contacts. They are probably induced by a preceding interdental 
approximant gesture (something like [Ö]), which is the common r-sound for this speaker. 
(1) speakers using a velar fricative [ɣ] also present the unvoiced variant [x] and 
the approximant variant [ɰ] in the appropriate contexts (mainly the unvoiced in 
voiceless consonant context and the approximant between vowels); 

(2) speakers preferring a uvular articulation may present trilled variants [ʀ] with 
one or more strikes (weakened forms of these sounds are fricative/approximant 
variants [v] or [s]) and unvoiced allophones in voiceless contexts ([r] / [x]; 
following the same distributional rule that could be observed in French);²³ 

(3) speakers occasionally resort to less controlled post-uvular articulations (the 
same speakers of the other points above may be subject to these alternations) 
which could give rise to [€], [$], [A] and many variants, often appearing as simple 
[e]-like sounds in positions where a weakening is likely to take place (generally 
in coda) or where a reduction gives rise to vocalic glides (between vowels); 

(4) speakers presenting labialisation and/or multiple articulation places use 
many other variants for velar and uvular r-sounds (see above); 

(5) people affected by r moscia (that is, more or less velarised/uvularised labiodental 
approximants) tend to occasionally allow the back articulation to prevail or to realise 
simple wavings between vowels, sometimes even leaving no trace of the gesture at all. 


6. Conclusions 


In the present study, general topics have been discussed in reference to historical 
and present-day representations of r-sounds in the Italian linguistic domain 
which are affected by quite different sociophonetic dynamics. 

In the first part of the paper, I have illustrated the normal basic realisations of 
/r/ ([ɾ], [r] and [rː] for Italian), its distribution and phonetic reduction rules. 
In Italian, singleton vs. geminate contrasts are generalised in the phonological 
system: /r/ and /rr/ are associated with different phonetic realisations often 
reinterpreted in different regional varieties on the grounds of the underlying 
dialectal systems. Nonetheless, the main source of r-variability is in social 
preferences and in first-language acquisition difficulties. 

In the second part of the paper, I have discussed the wide range of possible 
slightly different realisations of apical rhotics and of their back variants, by 
highlighting the need for a better articulatory account (testing the presence vs. 
absence of palatalisation, lip-rounding and secondary articulations, as well as 


23 A number of other possibilities arise for speakers not respecting this ‘natural’ distribution, then generalising 
for instance [x] in all the positions or extending the allophones to both /r/ and /rr/ (by neutralising the 
contrast). I would like to draw attention to the case of a southern area (northern Apulia) where, among a 
number of speakers using [ɣ] and [x] or [x] and [x] as common variants of pinched r, one may hear some 
people only using the voiceless variants in phonetic contexts where they are not usual, thus being 
distinguished from the rest of the community (see Romano, forthcoming). 
of concomitant gestures and conditioning effects on the surrounding sounds). 
Several varieties of unusual r-sounds have been surveyed, ranging from limp or 
lifeless r's to pinched r's and liquid r's. 

With regard to the socio- and geo-linguistic situation, several characteristics 
have been identified. These may help to determine different kinds of r moscia on 
the grounds of the phonetic distinction proposed in the recent literature 
on rhotics between trilling-variants as opposed to waving-variants. 


Acknowledgements 


Part of the work was carried out during my stay in Grenoble and benefitted from 
the collaboration with Cyril Trimaille and Patricia Lambert of the LIDILEM. I 
acknowledge the people of the two laboratories where I worked during those years: 
the former Institut de la Communication Parlée (ICP) and Centre de Dialectologie de 
Grenoble (in particular Pierre Badin and Michel Contini). I am also indebted to 
the staff of the Linguistic Atlases ALEPO, ALI and ATPM (in particular Sabina 
Canobbio and Matteo Rivoira) for giving me access to (or helping me to collect) 
audio materials on Piedmontese r's. I am grateful to Manuel Barbera, Paolo 
Mairano and Marco Tomatis of the former Faculty of Foreign Languages of Turin 
and to the Synthesis team of Loquendo Technologies Ltd. for allowing me to access 
their databases of different languages and dialects. Last but not least, I would 
like to acknowledge Hans Van de Velde, Didier Demolin, Alessandro Vietti and 
Lorenzo Spreafico for having encouraged me to keep following the thread of this 
research. I am particularly grateful to Alessandro and Lorenzo for allowing me to 
publish part of my previous unpublished work in this volume. 


References 


Alwan, Abeer, Shrikanth Narayanan & Katherine Haker. 1997. Toward articulatory-acoustic 
models for liquid approximants based on MRI and EPG data. Part II: the 
rhotics. Journal of the Acoustical Society of America 101. 1078-1089. 

Barry, William. 1997. Another R-tickle. Journal of the International Phonetic Association 
27. 35-45. 

Bhat, D.N.S. 1974. Retroflexion and retraction. Journal of Phonetics 2. 233-237. 

Billiez, Jacqueline, Karin Krief, Patricia Lambert, Antonio Romano & Cyril Trimaille. 
2002. Pratiques et représentations langagières de groupes de pairs en milieu urbain. 
Rapport pour l'Observatoire des pratiques linguistiques en France - DGLFLF 
(Délégation Générale à la Langue Française et aux Langues de France), Ministère 
de la Culture et de la Communication, manuscript. 

Bonnard, Henri. 1982. Synopsis de phonétique historique. Paris: Sedes. 

Cabiale, R. 1970. Tracce grafiche dell'oscillazione L-R in documenti medioevali del Piemonte 
meridionale. MA thesis, Università degli studi di Torino. 

Canepari, Luciano. 1979. Introduzione alla fonetica. Torino: Einaudi. 

Canepari, Luciano. 1999. MaPI - Manuale di Pronuncia Italiana. Bologna: Zanichelli. 

Canepari, Luciano. 2005. A handbook of phonetics. Natural phonetics: articulatory, auditory 
& functional. München: Lincom Europa. 

Catford, John. 2001. On Rs, rhotacism and paleophony. Journal of the Acoustical Society of 
America 31. 171-185. 

Chambers, Jack & Peter Trudgill. 1998. Dialectology. Cambridge: Cambridge University 
Press. 

Contini, Michel. 1983. Etude de géographie phonétique et de phonétique instrumentale du 
sarde. PhD thesis, Université de Strasbourg. 

Delattre, Pierre. 1966. A contribution to the history of «R grasseyé». In Pierre Delattre, 
Studies in French and comparative phonetics. Selected papers in French and English, 
206-207. The Hague: Mouton (orig. publ. in Modern Language Notes 1944: 562- 
564). 

Delattre, Pierre. 1970. Des indices acoustiques aux traits pertinents. Proceedings of the 
6th International Congress of Phonetic Sciences, 35-47. Prague: Academia Publishing 
House of the Czechoslovak Academy of Sciences. 

Delattre, Pierre. 1971. Pharyngeal features in the consonants of Arabic, German, 
Spanish, French and American English. Phonetica 54. 93-108. 

Demolin, Didier. 2001. Some phonetic and phonological observations concerning 
/R/ in Belgian French. In Hans Van de Velde & Roeland van Hout (eds.), r-atics. 
Sociolinguistic phonetic and phonological characteristics of /r/, 63-73. Bruxelles: ILVP. 

Fant, Gunnar. 1960. Acoustic theory of speech production. The Hague: Mouton de Gruyter. 

Felloni, Maria. 2011. Prosodia sociofonetica: l'italiano parlato e percepito a Parma. Milano: 
Franco Angeli. 

Ferrero, Franco, Arturo Genre, Louis-Jean Boë & Michel Contini. 1979. Nozioni di 
fonetica acustica. Torino: Omega. 

Francescato, Grazia. 1970. Il linguaggio infantile. Strutturazione e apprendimento. Torino: 
Einaudi. 

Ghia, Alberto. 2010. Il «malessere delle laterali» in area astigiana. MA thesis, Università 
degli studi di Torino. 

Giannelli, Luciano. 1983. Considerazioni sullo stato del rotacismo di l preconsonantico 
nell'Italia centrale. Quaderni dell'Istituto di Linguistica dell'Università di Urbino 1. 
135-154. 


Hagiwara, Robert. 1995. Acoustic realisations of American /R/ as produced by women 
and men. UCLA Working Papers in Phonetics 90. 1-187. 

Hamann, Silke. 2002. Retroflexion and retraction revised. ZAS papers in linguistics 28. 13-25. 

Inouye, Susan. 1995. Trills, taps and stops in contrast and variation. PhD dissertation, 
University of California, Los Angeles. 

Jakobson, Roman. 1957. Mufaxxama: the ‘emphatic’ phonemes in Arabic. In Ernst 
Pulgram (ed.), Studies presented to Joshua Whatmough, 105-115. The Hague: Mouton 
de Gruyter (republ. in Selected writings. The Hague: Mouton de Gruyter, 1962, 510- 
522). 

Kavitskaya, Darya. 1997. Aerodynamic constraints on the production of palatalized trills: 
the case of the slavic trilled [r]. In George Kokkinakis, Nikos Fakotakis & Evangelos 
Dermatas (eds.), Proceedings of Eurospeech '97, 751-754. Baixas: International Speech 
Communication Association. 

Kvale, Knut & Arne Foldvik. 1995. An acoustic analysis of the retroflex flap. In Elenius 
Kjell & Peter Branderud (eds.), Proceedings of the 13th International Congress of 
Phonetic Sciences 2, 454-457. 

Labov, William. 1972. Sociolinguistic patterns. Philadelphia: Pennsylvania University 
Press. 

Ladefoged, Peter. 1993. A course in phonetics. Fort Worth, TX: Harcourt, Brace, and 
Jovanovich. 

Ladefoged, Peter & Ian Maddieson. 1996. The sounds of the world’s languages. Oxford: 
Blackwell. 

Ladefoged, Peter, Anne Cochran & Sandra Disner. 1977. Laterals and trills. Journal of 
the International Phonetic Association 7. 46-54. 

Lindau, Mona. 1985. The story of /r/. In Victoria Fromkin (ed.), Phonetic Linguistics: 
Essays in honor of Peter Ladefoged, 157-168. Orlando: Academic Press. 

Matranga, Vito. 2007. Trascrivere: la rappresentazione del parlato nell'esperienza dell’Atlante 
Linguistico della Sicilia (Piccola Biblioteca dell'ALS, 5). Palermo: Centro di Studi 
Filologici e Linguistici Siciliani. 

McGowan, Richard. 1992. Tongue-tip trills and vocal tract wall compliance. Journal of 
the Acoustical Society of America 91. 2903-2910. 

Meyer-Eppler, Werner. 1959. Zur Spektralstruktur der /r/-Allophone des Deutschen. 
Acustica 9. 247-250. 

Migliorini, Bruno. 1937/1992. Storia della lingua italiana. Firenze: Sansoni. 

Millardet, Georges. 1933. Sur un ancien substrat commun à la Sicile, la Corse et la 
Sardaigne. Revue de Linguistique Romane IX. 346-369. 

Mioni, Alberto. 1986. Fonetica articolatoria: descrizione e trascrizione degli atteggiamenti 
articolatori. In Lucio Croatto (ed.), Trattato di foniatria e logopedia. Aspetti fonetici 
della comunicazione vol. II, 15-88. Padova: La Garàngola. 



Pacini, Beatrice. 2004. Particularités du dialecte de Monteroni d'Arbia et d'autres 
localités de la Toscane sud-orientale. Géolinguistique 9. 117-143. 

Recasens, Daniel. 1991. On the production characteristics of apicoalveolar taps and trills. 
Journal of Phonetics 19. 267-280. 

Rohlfs, Gerhard. 1966. Grammatica storica dell'italiano e dei suoi dialetti. Fonetica. 
Torino: Einaudi (orig. ed., Historische Grammatik der Italienischen Sprache und ihrer 
Mundarten - Lautlehre. Bern: Francke, 1949). 

Romano, Antonio. 1999. A phonetic study of a Sallentinian variety (southern Italy). 
In John Ohala, Yoko Hasegawa, Manjari Ohala, Daniel Granville & Ashley Bailey 
(eds.), Proceedings of the XIVth International Congress of Phonetic Sciences, 1051-1054. 
Berkeley: University of California, Berkeley. 

Romano, Antonio. 2008. Inventari sonori delle lingue: elementi descrittivi di sistemi e processi 
di variazione segmentali e sovrasegmentali. Alessandria: Dell'Orso. 

Romano, Antonio (ed.) forthcoming. R+are humanum est: history and geography of 
r-sounds in Italy and other places in the World. 

Romano, Antonio & Francesco Gambino. 2010. Cacuminali calabresi: modi e 
luoghi d'articolazione alla luce di misurazioni acustiche e indagini per risonanza 
magnetica (IRM). In Franco Cutugno, Piero Maturi, Renata Savy, Giovanni 
Abete & Iolanda Alfano (eds.), Parlare con le macchine, parlare con le persone, 505- 
513. Torriana: EDK. 

Romito, Luciano & Giovanni Belluscio. 1996. Studio elettropalatografico dell'opposizione 
fonematica [II], [dd], [dd] nel dialetto di Catanzaro e [4], [A], [d], [ð] nella parlata 
albanese di San Basile. Proceedings of the XXIV Convegno Nazionale dell'A.I.A., 141-144. 

Rossi, Mario. 1974. Description phonétique et phonologique du parler de Rossano (Province 
de Massa, Italie). MA thesis, Paris. 

Rousselot, Pierre. 1913. Dictionnaire de la prononciation française (suite): r double 
- aspiration - éléments atones des diphtongues françaises. Revue de Phonétique 3. 50-83. 

Sankoff, Gillian & Hélène Blondeau. 2013. Instability of the [r] ~ [R] alternation in 
Montreal French: An exploration of stylistic conditioning in a sound change in 
progress. This volume. 

Schiller, Niels & Christine Mooshammer. 1995. The character of /r/-sounds: articulatory 
evidence for different reduction processes with special reference to German. In 
Elenius Kjell & Peter Branderud (eds.), Proceedings of the 13th International Congress 
of Phonetic Sciences 3. 452-455. 

Solé, Maria-Josep. 1999. Production requirements of apical trills and assimilatory 
behaviour. In John Ohala, Yoko Hasegawa, Manjari Ohala, Daniel Granville & 
Ashley Bailey (eds.), Proceedings of the XIVth International Congress of Phonetic 
Sciences, 487-489. Berkeley: University of California, Berkeley. 

Sorianello, Patrizia & Antonella Mancuso. 1998. Le consonanti retroflesse del cosentino: 
un'analisi preliminare. In Pier Marco Bertinetto & Lorenzo Cioni (eds.), Unità 
fonetiche e fonologiche: produzione e percezione, 142-154. Roma: Esagrafica. 

Stevens, Kenneth. 1989. On the quantal nature of speech. Journal of Phonetics 17. 3-45. 

Vagges, Kyriaki, Franco Ferrero, Emanuela Magno Caldognetto, & C. Lavagnoli. 1978. 
Some acoustic characteristics of Italian consonants. Journal of Italian Linguistics 3. 
69-85 (reference is made to the preprint presented at the 8th International Congress of 
Phonetic Sciences, Leeds 1975, 23 pages). 

Van de Velde, Hans & Roeland van Hout (eds.). 2001. r-atics. Sociolinguistic phonetic and 
phonological characteristics of /r/, 27-43. Bruxelles: ILVP. 

Van de Velde, Hans, Evie Tops & Roeland van Hout. 2013. The spreading of uvular /r/. 
This volume. 

Vietti, Alessandro, Lorenzo Spreafico & Antonio Romano. 2010. Tempi e modi di 
conservazione delle /r/ italiane nei frigoriferi CLIPS. In Stephan Schmid, Michael 
Schwarzenbach & Dieter Studer (eds.), La dimensione temporale del parlato, 113-128. 
Torriana: EDK. 

Widdison, Kirk. 1997. Variability in lingual vibrants: changes in the history of /r/. 
Language and Communication 17. 187-193. 




The spreading of uvular [R] in Flanders 


Hans Van de Velde¹, Evie Tops² & Roeland van Hout³ 
¹Universiteit Utrecht 
²Université Libre de Bruxelles 
³Radboud Universiteit Nijmegen 


Abstract 

In this paper the socio-geographical distribution of alveolar and uvular /r/ in Flanders is 
researched to provide support for the idea that uvular [R] has become more widespread 
in Flanders in the course of the 20th century. Due to its contact history with French 
and its relationship with German dialects, the Flemish situation might provide more 
insight into the controversy around the spread of uvular [R] in Western Europe. Three data 
sources are used for this study: two existing traditional dialect surveys and a new 
socio-geographic survey based on a sociolinguistic approach. 


1. Introduction 


Although /r/ is marked by large-scale variation in the languages of the world 
(cf. Van de Velde & van Hout 2001), the alveolar trill [r] is the prototypical 
r-sound, judging by the statistics provided by Maddieson (1984:83). Uvular 
[R] is infrequent and its occurrence seems to require an extra or special 
explanation. 

The rise of uvular [R] in Western Europe has been debated among linguists 
since the end of the 19th century. Trautmann (1880) attributed the origin of 
uvular [R] to the Parisian elite in the 2nd half of the 17th century and Chambers 
& Trudgill (1980:186ff) explain the presence of uvular [R] in Danish, Dutch, 
German, Norwegian and Swedish by the prestige and influence of French. 
However, it took uvular [R] centuries to become the common variant in France 
(Martinet 1985:38-39; Carton 1995:36; Tops 2009:238-246). Moreover, the French 
connection has been contested as an explanation by a number of authors. 
Moulton (1952) and Penzl (1961) showed that varieties of German had uvular 
[R] before (Parisian) French, and Wiese (2001) argued that the developments 
in German were independent from those in French. 
Many studies report the occurrence of uvular [R] in Flanders, the Dutch-speaking 
part of Belgium, especially dialectological studies (see Section 2). In 
some areas the presence of uvular variants is linked to French, in other areas it 
is considered to be a product of the dialect continuum with German. Due to its 
contact history with French and its relationship with German dialects, Flanders 
seems to be an ideal testing ground to get more insight into the controversy 
around the spread of uvular [R] in Western Europe. To provide support for the 
idea that there is a general rise of uvular [R] in Flanders and to get more insight 
into the mechanisms underlying this change, the socio-geographical distribution 
of alveolar and uvular /r/ needs to be investigated more systematically. 

The data we present come from three rich data sources, two more traditional 
dialect surveys (RND, GTRP) and one socio-geographic survey (RAS) based on 
a sociolinguistic approach. We will discuss the data and results in Sections 3 to 
5. In Section 6 we will argue that we can put the results of the three data sources 
on a time scale, in order to make the rise and spread of uvular [R] visible. It is not 
only observed at the borders of the language area, but there are also patterns of 
internal diffusion, as we will argue. We also need to discuss why uvular [R] has 
the prestige it appears to have acquired, a question we will address in Section 7. 


2. Flanders and uvular [R] 


In this contribution the Flemish provinces will be used to interpret the regional 
distribution of (r). From left to right (west to east) in Map 1 we find: 


-  West-Flanders: bordering the North Sea and France; 

-  East-Flanders: bordering the south-western part of the Netherlands and 
Wallonia; 

- Antwerp: bordering the southern part of the Netherlands; 

-  Flemish-Brabant: bordering Wallonia and containing Brussels; 

- Limburg: bordering the southern part of the Netherlands. 


France and Wallonia are French speaking. Brussels is officially bilingual 
French-Dutch, French being the dominant language (Janssens 2008), but the 
local dialect is Brabant Dutch (De Vriendt & Willemyns 1987). Belgian French 
has mainly uvular variants (Demolin 2001). Alveolar variants occasionally show 
up in different regions but they are considered archaisms (Hambye 2005:208). 
Unfortunately, the quality of /r/ is not transcribed in the linguistic atlas of 
Wallonia (Remacle 1953:59). 
From: Vandeputte, Omer. 1983. Dutch: the language of twenty million Dutch and Flemish 
people. Rekkem: Stichting Ons Erfdeel. © Ons Erfdeel vzw. Reprinted with permission. 


In dialectological studies uvular [R] is systematically reported for the eastern part 
of Limburg. Grootaers (1951:40) states that all Flemish dialects have an alveolar 
realization except for the Limburg dialects. The same conclusion can be found 
in Weijnen (1991), who adds that the uvular [R] is characteristic of the 
north-eastern part of Limburg. As for the provinces of Flemish Brabant and Antwerp, 
Brussels is marked as a uvular area (Mazereel 1931; Baetens-Beardsmore 1971; 
Weijnen 1991; De Vriendt 2004). Uvular [R] is reported for Aarschot (Pauwels 
1958) and Turnhout (Aerts 1955). More recently, De Schutter (1999:304) 
observes its occurrence in the city of Antwerp and claims that it is becoming 
the norm in the Antwerp urban dialect, which is the most prestigious dialect in 
Flanders (ib.:303). However, in De Schutter & Nuyts (2005), uvular [R] is not 
mentioned as a characteristic of the urban dialect. In East-Flanders, uvular [R] is 
known as a stereotype of the Ghent urban dialect. De Gruyter (1907) observed 
the new variant at the beginning of the 20th century. Rogier (1994) documented 
its spread in the surrounding suburbs and villages. In West-Flanders no special 
observations about the pronunciation of /r/ are made. 

At the same time, in the 20th century, uvular [R] has often been considered a 
speech deficit (De Schutter 1999:304; Verstraeten & Van de Velde 2001:46; 
Tops 2009:198). Consequently, many children were sent to a speech therapist 
for treatment, although Blancquaert (1934:114) had argued in favor of 
tolerance towards uvular [R]. At the Flemish broadcasting corporation speakers 
with uvular [R] did not pass the microphone test (Van de Velde 1996:126) and 
especially schools for drama and eloquence banned uvular [R]. The Belgian film 
maker and author De Kuyper (1993:37-39) describes how he was banned from 
a Flemish music academy due to his French /r/ (i.e., uvular [R]). Nowadays, 
policy has changed and uvular [R] (except for variants with strong friction, as 
occurring in the Ghent dialect) is accepted at the broadcasting corporation 
(Tops 2009:198). Since the 1960s, textbooks for speech therapists have shown 
more tolerance toward uvular [R], but even today alveolar [r] is still preferred 
“for technical reasons” (Timmermans 2004:33-34). Interestingly, Van Bezooijen 
(2003:83) suggests that some speakers are simply not able to produce an alveolar 
trill and use a uvular trill instead, and she sees this genetic characteristic as one 
of the mechanisms involved in the spread of uvular [R]. 


3. RND 


The dialectologist Blancquaert started collecting data in 1922 for the first part of 
the Reeks Nederlandse Dialectatlassen (RND), inspired by Gilliéron's work for the 
Atlas Linguistique de la France (Hagen 1995:81). Later, RND developed into 
a series of dialect atlases covering the complete Dutch language area, with 1956 
localities and 4012 informants. The volumes covering Flanders were published 
between 1925 and 1962 and the data were collected between 1922 and 1953 
(Reker 1997:51). A standard questionnaire, mainly consisting of sentences to be 
translated into the local dialect, was used for data collection by experienced field 
workers who transcribed the sentences on the spot in (narrow) IPA (bear 
in mind that portable recording equipment was only introduced in the 1950s). 
The Flemish data used in this study were mainly collected by Blancquaert and 
collaborators trained by him. For our analysis, we will focus on the 859 localities 
that are currently situated in the Flemish Community and Brussels Capital 
Region (the linguistic border was fixed in Belgium in 1962). 

Blancquaert first selected localities with at least 2000 inhabitants; smaller places 
were added in transition zones and where distances between places were larger than 
5 km (Hagen 1995:81). Blancquaert did not opt for the traditional NORMs 
(non-mobile, older rural males; the common type of informant in dialect 
geography; cf. Chambers & Trudgill 1980:33). Instead, he had a preference for 
informants between 20 and 40 years of age (Blancquaert 1948:24), who grew up in 
the locality and had local parents. About half of the informants belonged to 
the middle class and almost one quarter were women. In most localities two 
to three dialect speakers served as informants. RND clearly aimed 
at recording the use of the dialect that was contemporary at the time of data 
collection. For a more detailed discussion of the characteristics of the RND 
informants in comparison with other dialect atlases we refer to Johnston (1985). 
The transcribers distinguished two variants of /r/: r met tongpunttrillingen (‘r with 
trills of the tongue tip’, i.e. [r]) and gebrouwde r (‘burred r’, i.e. a uvular realization). 
Our analysis is based on lexical items from three sentences of the questionnaire: 
36 (peer ‘pear’), 85 (rijkdom ‘wealth’) and 86 (dorst ‘thirst’). We 
selected these items as they were not marked by variation in lexical form, which 
means that they kept the /r/ most of the time. Lexical variants and realizations 
in which (r) was deleted were coded as missing values. For each dialect an index 
score (percentage) was calculated between 0 (alveolar) and 100 (uvular). 

In total, 859 Flemish dialects were incorporated. Map 2 gives an overview of 
the distribution of (r) in Flanders: 803 dialects (93.5%) have alveolar [r] and 56 
(6.5%) have uvular [R]. None of the places had variation in the transcription 
of place of articulation of (r), and the transcribers had only made remarks for 
two dialects (0.2%) on local variation in the pronunciation of /r/. 


Map 2 — Geographical distribution of (r) in RND data in Flanders (859 localities, collected 
between 1922-1953). White dots: alveolar [r]; black squares: uvular [R]. 


The overall impression of Map 2 is that uvular [R] is more characteristic 
of the periphery of the Flanders area than of its core parts. No larger urban 
centers are involved, except for bilingual Brussels. Map 2 shows that uvular [R] 
is present in the following areas (for a more detailed listing of the places, see 
Tops 2009:204-5): 


- 35 places in the east of the Limburg province, including three places in the 
Voer-region; 

- Five places in West-Flanders, one isolated, one near the coast and French 
Flanders, four on the Dutch-French language border; 

- Nine places in Brussels and immediate surroundings; 

- In the rest of Flemish-Brabant: two places on the Dutch-French language 
border and five places in the east of the province. 


4. GTRP 


The Goeman-Taeldeman project aimed at collecting a phonological and 
morphological corpus of the Dutch dialects (Goeman & Taeldeman 1996). 
The coded transcriptions are available as the Goeman-Taeldeman-Van Reenen 
database (GTRP). In total, 613 dialects were recorded in the Dutch language 
area. 189 localities were selected in Flanders, a much sparser network than that of 
RND. Goeman (1999:58-70) presents an analysis of the social characteristics of 
the Dutch GTRP informants. Unfortunately, no detailed information has been 
published about the Flemish part of GTRP. 

For each locality in Flanders there was one informant. All the Flemish data were 
collected between 1990 and 1993. Participants were not exclusively NORMs 
(Goeman & Taeldeman 1996:52-53). Urban dialects were also included in 
the sample, and whether a man or a woman was selected depended mainly 
on practical issues such as availability and willingness to participate. Non-educated 
informants were only selected if they had enough metalinguistic awareness and 
insight into the aim of the questionnaire. Almost all participants were between 
50 and 75 years old at the time of recording. The main criterion was that 
participants were indigenous (i.e., grew up locally), preferably of indigenous 
parents and speaking the dialect on a regular basis (Goeman & Taeldeman 
1996:53). 

Almost all Flemish data were collected and transcribed by two field workers. 
The questionnaire contained 1867 items and was sent to the participants about 
a week in advance. The recordings took about half a day for each informant. For 
our analysis we selected the 161 singular nouns containing /r/. In the Flemish 
data the transcribers only made a distinction between alveolar, uvular and deleted 
variants of /r/.¹ The deleted variants were not taken into consideration 
for the calculation of the index scores on the front-back dimension. Map 3 gives 
an overview of the distribution of /r/ in Flanders on the basis of the GTRP 
data. 

Map 3 - Geographical distribution of (r) in GTRP data in Flanders (189 localities, collected 
between 1990-1993). White dots: homogeneous alveolar localities; black squares: 
homogeneous uvular localities; grey triangles: non-homogeneous place of articulation. 


In these dialect data, too, places with alveolar [r] are dominant (167/189, 
88.4%). Thirteen places only show uvular [R] (6.9%), and nine places/informants 
(4.8%) mix [r] and [R], but it should be noted that they almost exclusively use 
uvulars. Again, as in RND, uvular [R] occurs in the peripheral area, with the 
exception of one important urban centre in the heart of East Flanders: Ghent. 
The occurrence of uvular [R] can be summarized as follows (for a listing of 
the places, see Tops 2009:206): 


- 18 places in the east of Limburg, including one locality in the Voer-region; 

- Two places on the Dutch-French language border; 

- Two cities: Ghent (East Flanders) and Ypres (West Flanders); 

- It is important to note that Brussels and its surroundings were not 
included in the places selected in GTRP. 


1 In the Netherlands the phonetic transcriptions are much more detailed and include variation in manner of 
articulation and voicing. 


5. The Rapid Anonymous Survey (RAS) 


Tops (2009) conducted a large-scale sociolinguistic survey in Flanders collecting 
data on the pronunciation of /r/ for 1,912 speakers distributed over 89 localities 
in Flanders. The aim of the study was to gain insight into the socio-geographical 
distribution of /r/. The technique was inspired by Labov's famous department 
store study of /r/ in New York City (Labov 1966). The rapid anonymous survey 
technique is well known thanks to the incorporation of Labov's study in most 
introductory textbooks in linguistics and sociolinguistics. The technique is also 
widely popular among undergraduate students taking their first steps in the 
study of language variation and change. Surprisingly, this research method is 
hardly used in international publications, Horvath & Horvath’s work on /l/ 
vocalization in New Zealand and Australian English being a rare and successful 
exception and adaptation (Horvath & Horvath 2001). An important adaptation 
of Labov’s method is that a short word list (shorter than Horvath & Horvath’s) 
was used and that the speech was recorded. 

The selection of the localities was done in two steps. In the first step, 39 localities 
were selected equally distributed over the whole of Flanders, including the 
main Flemish urban centers Antwerp, Bruges, Ghent and Genk, towns with 
a regional function (as defined in the official spatial planning documents; cf. 
www.ruimtelijkeordening.be) and villages. In the second step of data collection, 
these localities were supplemented by 50 localities selected in areas where 
alveolar and uvular variants of (r) co-occur. These regions were the surroundings 
of Ghent, the east of Limburg (the boundary of the old uvular [R] area) and an 
area north of Antwerp (Hoogstraten and surroundings). 

People walking on the street or shopping were approached in the 89 localities by 
a field worker speaking standard Dutch with the request to participate in a 
voice-quality study by Brussels University (Vrije Universiteit Brussel) that would 
take less than 2 minutes of their time. This guise was used as a justification for 
recording the participant's speech and kept the purpose of the research 
hidden. If somebody agreed to participate, the interviewer asked their age and 
whether they were local (only local participants were selected for the analysis). 
They had to read 20 words, listed on four cards, offered in a random order. The 
aim was to fill a quota sample of four groups of five participants: two age groups 
(old vs. young) by gender. The age ranges were 16 to 35 (young) and older than 
35 (old). When the quota proved difficult to fill, the locality in question was 
revisited. With only a few exceptions (even after returning twice to the same 
place), the quota could be filled fairly well. The total number of participants was 
1,912. 

The speech was recorded on a portable TASCAM DA-P1 recorder, with 
a Sennheiser MD425 dynamic supercardioid hand-held microphone. The 
recordings not only had the sound quality required for reliable auditory 
analysis; most of them were also good enough for acoustic analyses (cf. Tops 
2009:21-120). 

The word list to be read aloud contained 20 monosyllabic words, distributed 
over five cards. Eight words were (r)-less distractors. All the cards ended with a 
distractor, to avoid end-of-list effects affecting the realization of (r). Twelve words 
contained our variable (r): three in onset position (reep, rood, reus), three in coda 
position (gaar, zuur, voer), three in an onset cluster starting with [t] (troon, trein, 
trui) and three in a coda cluster ending with [t] (buurt, kaart, woord). 

The number of usable (r) realizations collected was 22,720. Twelve variants, 
defined along the dimensions of trilling, friction, place of articulation, and voicing, were 
distinguished on the basis of a combination of auditory and spectral analyses by the 
second author. The methodology was developed in collaboration with a 
number of specialists in the fields of phonetics, dialectology and language variation 
and change. In cases of doubt, these specialists were also consulted for the coding 
of the variants. This resulted in six alveolar variants, four uvular variants, schwa and a null 
realization. Per locality the number of variants ranged between four and twelve. 74 
localities (83.1%) had eight to eleven variants. This shows the enormous variability 
of the pronunciation of /r/ within localities. Some variants were particularly related 
to the position of (r) in the word, but we will not further investigate the role of 
the linguistic context in this contribution. Except for the front-back distinction 
of alveolar vs. uvular, there were no socio-geographic patterns in the distribution 
of the variants. Therefore, and for the sake of comparison with the RND and GTRP 
data, we will focus on the alveolar-uvular dichotomy in the remainder of this paper. 
We found 15,623 alveolar realizations (68.8%), 7,044 uvular realizations (31.0%), 
11 schwas (0.0%), and 55 deletions (0.2%). We computed for each speaker a front- 
back index or percentage, excluding the schwa and the null realization. A score of 
0 means only alveolar variants, a score of 100 means only uvular realizations. The 
next step was to compute the average percentages per locality. The distribution of 
the scores for all localities can be found in Figure 1. 
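
The front-back index described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the RAS data: the variant labels ("alveolar", "uvular", "schwa", "null"), the function names and the toy speakers are all hypothetical.

```python
from statistics import mean

def front_back_index(tokens):
    """Percentage of uvular tokens among the alveolar and uvular tokens.

    Schwa and null realizations are left out of the calculation,
    as described in the text. Returns None if nothing can be counted.
    """
    alveolar = sum(1 for t in tokens if t == "alveolar")
    uvular = sum(1 for t in tokens if t == "uvular")
    counted = alveolar + uvular
    if counted == 0:
        return None
    return 100.0 * uvular / counted

def locality_means(speakers):
    """Average the speaker indices per locality.

    `speakers` is a list of (locality, tokens) pairs (hypothetical layout).
    """
    per_locality = {}
    for locality, tokens in speakers:
        index = front_back_index(tokens)
        if index is not None:
            per_locality.setdefault(locality, []).append(index)
    return {loc: mean(values) for loc, values in per_locality.items()}

# Toy data, invented for illustration only.
toy_speakers = [
    ("A", ["alveolar"] * 12),
    ("A", ["uvular"] * 11 + ["null"]),
    ("B", ["alveolar"] * 9 + ["uvular"] * 3),
]
print(locality_means(toy_speakers))
```

Averaging speaker indices (rather than pooling all tokens of a locality) follows the order of steps given in the text: first one score per speaker, then the mean per locality.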

How is the distinction of alveolar and uvular variants distributed over speakers 
and localities? There is a lot of variation between speakers, ranging from 
completely alveolar to completely uvular, but there is a remarkable absence 
of variation within speakers. Individual speakers rarely mix uvular 
and alveolar variants in our recordings. The number of speakers who mix both 
variant types is 102 (5.3%), of whom only 38 (2.0%) have a mix of 20% or more 
of one variant and 80% or less of the other one. 

Figure 1 - The frequency distribution of the average front/back scores for the 89 localities. 


Figure 1 shows that most localities have more alveolar realizations, as is also 
clear from the mean percentage of 31.1%. Most localities have a mixture of 
front and back variants. Only seven places have exclusively alveolar [r], and 
only two have exclusively uvular realizations, which implies that uvular [R] is present in almost 
all localities. When localities have a score somewhere between 0 and 100, these 
localities have a mix of uvular and alveolar speakers, as the number of mixed 
speakers is low (5.3%). In most mixed places alveolar realizations are dominant 
(60 scores below 50 and above 0) and only eight places have a score of 80% or 
more on the front-back dimension. The standard deviation of 45.5 reflects a 
high degree of variability between the localities. 
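
The distinction between homogeneous and mixed speakers used in this section can be made explicit as follows. The sketch assumes that each speaker already has a front-back index between 0 and 100; the category names and the function name are hypothetical, and the 20% threshold is taken from the counts of mixed speakers reported above.

```python
def classify_speaker(index, threshold=20.0):
    """Classify a speaker by front-back index (0 = alveolar, 100 = uvular).

    A speaker is 'mixed' when both variant types occur; the mix counts as
    'strongly mixed' when the minority variant reaches the threshold
    (20% or more of one variant, 80% or less of the other).
    """
    if index == 0.0:
        return "alveolar"
    if index == 100.0:
        return "uvular"
    minority_share = min(index, 100.0 - index)
    if minority_share >= threshold:
        return "strongly mixed"
    return "slightly mixed"

# With twelve (r) words, one deviant token gives an index of about 8.3.
print(classify_speaker(100.0 * 1 / 12))
print(classify_speaker(100.0 * 3 / 12))
```

Because a locality score is a mean over speakers, a locality can score between 0 and 100 even when every speaker in it is classified as homogeneous.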

If a change from alveolar to uvular is ongoing in Flanders, the localities 
will show an increase in the uvular index when younger and older speakers are 
compared. Figure 2 presents the comparison between the two age groups per 
locality. The indices of the younger group of participants are represented on the 
vertical axis and the indices of the older group of participants on the horizontal 
axis. The diagonal indicates the position of the indices if the two groups had 
the same scores. A stable age distribution, indicating absence of change in 
progress, would produce scores oscillating randomly near the diagonal. 

Figure 2 - Scattergram of the mean front-back index (percentages of uvular [R]) of the younger 
group versus the older group for 88 localities (the young age group was lacking in one of the 
localities). 


The pattern in the scattergram of Figure 2 is remarkably clear. With 
only a few exceptions, all dots are in the upper part, above the diagonal. Younger speakers have more uvular 
realizations than older speakers. This sharp shift towards uvular realizations is 
found in all provinces (Brabant, Antwerp, Limburg, East-Flanders), except 
West-Flanders where all localities had low scores (about or less than 20%), both 
for the younger and older age groups, indicating that this region has mainly 
alveolar realizations. Some localities have a strong shift, from a percentage of 
(about) 0% for the older age group to more than 80% for the younger age 
group. The few scores in the area under the diagonal might be the 
result of sampling fluctuation. 
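
The reading of the scattergram can be expressed as a simple comparison per locality. The following sketch assumes a mapping from locality to a pair of (older, younger) front-back indices; the names, the tolerance parameter and the toy values are hypothetical and do not reproduce the RAS results.

```python
def position_relative_to_diagonal(old_index, young_index, tolerance=0.0):
    """Locate a locality relative to the diagonal (young = old)."""
    difference = young_index - old_index
    if difference > tolerance:
        return "above"   # younger group more uvular than older group
    if difference < -tolerance:
        return "below"   # younger group less uvular than older group
    return "on"

def summarize_shift(localities, tolerance=0.0):
    """Count localities above, on and below the diagonal."""
    counts = {"above": 0, "on": 0, "below": 0}
    for old_index, young_index in localities.values():
        position = position_relative_to_diagonal(old_index, young_index, tolerance)
        counts[position] += 1
    return counts

# Toy values, invented for illustration only.
toy_localities = {"A": (0.0, 85.0), "B": (10.0, 10.0), "C": (40.0, 30.0)}
print(summarize_shift(toy_localities))
```

A stable age distribution would give roughly equal counts above and below the diagonal; a change in progress towards uvular variants shows up as a large surplus of localities above it.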

The geographical age shift can be illustrated with two maps: Map 4 gives the 
geographical distribution for the old speakers, Map 5 for the young ones. The 
darker the symbol, the higher the percentage of uvular [R] use in a locality. 

Map 4 - Geographical distribution of the front-back index for the older age groups in RAS, 
ranging in gray between white symbols (0%, only alveolar realizations) and black symbols 
(100%, only uvular realizations). 


Map 4 shows the use of uvular [R] by older speakers. Uvulars are frequently 
used in: 


- The east of the province of Limburg; 
- Ghent and surroundings. 


Additionally, uvular [R] shows up in: 


- A number of places on the Dutch-French language border; 
- Almost the whole province of Flemish-Brabant; 
- The north of the province of Antwerp. 


Map 5 - Geographical distribution of the front-back index for the younger age groups in RAS, 
ranging in gray between white symbols (0%, only alveolar realizations) and black symbols 
(100%, only uvular realizations). 


Map 5 for the younger age groups shows a strong geographical expansion of the 
uvular realizations in comparison with the older age groups in Map 4: 


- There is an increase of the use of uvulars in Limburg: quantitatively in the 
old uvular places, geographically towards the west to places in Limburg 
that were originally outside the uvular area; 

- The Ghent area has become larger, indicating geographical diffusion from 
the city of Ghent; 

- The localities in Brabant show a strong increase in the use of uvular 
realizations, all over the province; 

- The localities north of Antwerp show an increase in the use of uvular [R]. 


6. Mapping the rise of uvular (R) in Flanders: Integrating 
the three data sources 


In the preceding sections, three different data sources on the realization of /r/ 
were discussed. At first sight, the geographical patterns nevertheless seem to fit together. 
No obvious contradictions show up in the geographical patterns observed, as the 
same areas consistently come out as centers of gravity and perhaps expansion 
of uvular [R]. Can we push the analysis one step further by reconstructing a 
time scale on the basis of our data sets that gives a more detailed and precise 
impression of the rise of uvular [R] in Flanders? That of course raises the question 
of the comparability of our data sources. Therefore we look at the similarities 
and differences between RND, GTRP and RAS. 

In Table 1 we have defined six pivotal characteristics that are used to evaluate our 
three data sources. The first characteristic is defined as dialect as target. RND and 
GTRP obviously have the local dialect as the target variety: the participants were 
asked to translate sentences and words into their local dialect and were selected for 
this purpose. In RAS the target variety was not made explicit to the participants 
and they did not know the real purpose of the study, as the guise of a voice quality 
study was used. However, the whole context of the study aimed at eliciting 
non-dialect / standard speech: the informants were addressed in standard Dutch by 
a researcher who identified as being from the university and were asked to read aloud 
a word list with (standard) Dutch words. Reading single words aloud is a regular 
activity in primary school. The visible use of recording equipment increased speech 
monitoring and the use of standard speech. None of the informants translated the 
words spontaneously into their local dialect, as was clear from the quality of their 
vowels. Therefore RAS is characterized as minus for dialect as target. 

                            RND      GTRP     RAS 

DIALECT AS TARGET           +        +        - 
LOCAL VARIATION             -        -        + 
ISOLATED WORDS              -        +        + 
METALINGUISTIC AWARENESS    +        +        - 
EXPERTS                     +        +        - 
YEAR OF BIRTH               ± 1910   ± 1930   Old: ± 1950 
                                              Young: ± 1980 


Table 1 — Characteristics of the three data sources. 


The RAS study aimed at charting local variation by sampling individual speakers 
who, unlike the informants of GTRP and RND, were not selected for their expertise. GTRP selected 
only one speaker per locality, who was presumed to be an expert in the local dialect. 
Although older, RND is more sociolinguistic in its approach (Hagen 1995). 
The field workers commonly worked with two or three informants per locality and 
asked questions about the local sociolinguistic situation (e.g., do people also speak 
standard Dutch or French, differences within the dialect, immigrants from other 
areas). We found a couple of remarks about the pronunciation of /r/ in the Flemish 
localities, but this did not lead to variation in the transcriptions of /r/ within 
localities. Therefore, RND is also marked as minus for local variation. Variation 
between speakers within localities is at the heart of the RAS data collection, aiming 
at 20 informants per locality. RAS used isolated words as triggers for eliciting data, 
GTRP isolated words and short phrases, RND sentences. 

The RAS survey was presented as a study on voice quality, not of dialect or 
standard pronunciation. That means that the informants focused less on their 
own speech and language than in RND and GTRP, which openly direct the 
awareness of the informants to the distinction between standard and dialect. 
Furthermore RND and GTRP selected expert speakers of the local dialect, 
while RAS selected speakers that were just local, without any evaluation of their 
speech characteristics. 

Despite these differences, there is one striking correspondence in the outcomes 
of the three surveys. There is hardly any variation on the front-back dimension 
of /r/ on the individual level. For RAS, based on twelve tokens of (r), we 
found 5.3% mixed speakers. For GTRP, with 161 (r) tokens, 
variation was found in only two localities (1.1%); no variation was found in the 
other 187 localities, despite the large number of words per locality. However, 
it must be mentioned that this low number of localities showing variation for 
place of articulation can be partly a result of the transcription method used 
in Flanders (Rob Belemans, p.c.). None of the 859 places in RND showed 
variation for the three words investigated, and only for two places a remark was 
made about the pronunciation of /r/. 

How can we put the three surveys on a time scale? We can globally estimate 
the average year of birth of the informants in the three surveys. For RND the 
age range of 20 to 40 years of age is reported as the default, which gives an 
average of 30 years of age. The time range of data collection in Flanders was 
between 1922 and 1953, the majority of the data being gathered in the 1920s 
and 1930s. That results in 1910 as a rough estimate of the year of birth of RND 
participants (1940 minus 30 years). GTRP was recorded between 1990 and 1993, 
with an average age of 60.5 years. That gives an estimate of 1930 as year of 
birth. The RAS data were collected between 2002 and 2004; the average age of the 
participants in the young age group is 24 and of the old group 54. That gives an 
estimated average year of birth for the younger group of 1980 and for the older 
group of 1950. These outcomes were included in Table 1. 
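
The arithmetic behind these estimates, and behind the periods shown on Map 6, can be written out as a small sketch. It assumes a reference year per source (1940 for RND as stated in the text; the midpoints 1991.5 for GTRP and 2003 for RAS are our own assumption), rounds the estimated year of birth to the nearest decade, and adds 20 years for the adult life stage. The function and variable names are hypothetical.

```python
def estimate_birth_and_period(reference_year, mean_age, adult_offset=20):
    """Return (estimated birth decade, period in which the group was adult)."""
    birth_year = reference_year - mean_age
    birth_decade = int(round(birth_year / 10.0)) * 10
    return birth_decade, birth_decade + adult_offset

# (reference year of data collection, average age of the informants)
sources = {
    "RND": (1940, 30),
    "GTRP": (1991.5, 60.5),
    "RAS old": (2003, 54),
    "RAS young": (2003, 24),
}

for name, (reference_year, mean_age) in sources.items():
    print(name, estimate_birth_and_period(reference_year, mean_age))
```

Under these assumptions the sketch gives 1910, 1930, 1950 and 1980 as birth decades, matching Table 1, and 1930, 1950, 1970 and 2000 as periods, matching the legend of Map 6.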


Map 6 - The spreading of uvular [R] in Flanders between 1930 and 2000. Black: areas RND (1930); 
dark grey: areas GTRP (1950); grey: areas RAS old (1970); light grey: areas RAS young (2000). 


To estimate a time slot for each of the data sources, we added 20 years to the 
estimated average year of birth, to indicate that the group involved had reached 
the adult life stage. This results in the following periods: 1930 (RND), 1950 
(GTRP), 1970 (RAS old) and 2000 (RAS young). Map 6 visualizes the 
geographical expansion of the use of uvular [R] in Flanders. Instead of outcomes 
for individual localities as in Maps 2 to 5, we have indicated areas where uvular 
[R] shows up. The uvular [R] areas in the oldest data (RND 1930) are marked in 
black. For the other sources/periods we used shades of grey, ranging from dark 
grey (1950) to light grey (2000). 

The following geographical patterns show up: 


- East-Limburg: The border of this old uvular [R] area moves gradually 
westwards; 

- Brussels: Gradual expansion to its surroundings; 

- Ghent: Gradual expansion from the city center to the surrounding area; 

- North of Antwerp: a new and expanding zone; 

- West-Flanders: uvular [R] is almost completely absent; 

- [R] islands, arising in different periods, pop up in different areas. Some of 
them are close to the linguistic border with French. It is unclear from Map 
6 whether the presence of uvular [R] places in the East of Flemish-Brabant 
is linked to the expansion in Limburg and Brussels. 


7. Conclusion and discussion 


Our three data sources provided substantial and complementary information 
on the occurrence and rise of uvular [R] in Flanders. The dialect data sources of 
RND and GTRP showed that uvular [R] had already acquired a (modest) 
position in the dialects of Flanders in the first half of the 20th century. This was 
confirmed by observations in the dialectological literature on specific dialects. 
The socio-geographical RAS data made clear that the use of uvular [R] sharply 
increased, as witnessed in particular by the data collected on the younger age 
groups. Whereas uvular [R] was particularly found in the periphery of the 
language area covered by the corpus (East-Limburg and occasionally near the 
language border with French) in the RND and GTRP data, it is present in all 
provinces of Flanders nowadays, with the exception of West Flanders, where 
uvular [R] only occasionally shows up. It is interesting to note that in the literature 
Bruges, the largest urban area in West-Flanders, has been mentioned several 
times as a place where uvular [R] was observed (Weijnen 1991:190). Bruges was 
part of the RAS database, and the mixed picture of Bruges is confirmed by the 
data. There were 18 homogeneous alveolar speakers, but two outspoken uvular 
speakers as well (one old, the other young). According to our RAS 
data Bruges does not seem to be an expansion center of uvular [R], but its presence 
and status certainly need to be investigated in more detail. 

The three databases together, as explained in Section 6, reveal the patterns of an 
ongoing change from alveolar [r] towards uvular [R], automatically raising 
the question of the origins of this change. Several origins or sources seem to 
be present. The most straightforward explanation can be given for uvular [R] in 
Limburg, as it is part of a larger and older Germanic dialect continuum marked 
by uvular [R]. Van Reenen (1994) concludes on the basis of the GTRP 
data from the Netherlands that Dutch Limburg is the core area of uvular [R], 
but he also observed the occurrence of uvular [R] in the Dutch province of 
North Brabant, with Breda as the center of expansion. Mees & Collins (1982) 
observed that uvular [R] is common in educated speech in large parts of the 
Netherlands, including Limburg and North-Brabant. 

The Dutch North-Brabant area borders the Antwerp province and might be the 
trigger for the emergence of uvular [R] in the north of the province of Antwerp. 
Since the 1980s this area (and also the Belgian East-Limburg area near the 
Dutch border) has seen a large influx of immigrants from the Netherlands, 
showing an increasing trend in the last two decades (Sumresearch 2006; WODC 
2009). In the beginning, a lot of Dutch immigrants were retired, wealthy people, 
motivated by economic reasons (cheaper housing, tax regulations). Since the 
1990s a broader section of the Dutch population has emigrated to Flanders, often 
for economic reasons. It should be noted that a substantial part of them remain 
closely attached to the Netherlands, for instance through a job in the Netherlands 
(WODC 2009:23). The number of Dutch immigrants 
ranged between 50 and 80 per 10,000 inhabitants in the period 1997-2003 in 
the area bordering the Netherlands (Sumresearch 2006:16). Specifically for the 
region north of Antwerp a yearly increase of about 980 inhabitants coming 
from the Netherlands is observed, while at the same time about 600 inhabitants 
leave the area for other Belgian places (Sumresearch 2006:17). It is not unlikely 
that the increasing presence of people from the Netherlands, many of whom 
come from a uvular [R] area and stay connected to it (e.g., by working in the 
Netherlands; WODC 2009:23), is a factor in the spread of uvular [R] in the 
region north of Antwerp. Of course, the prestige of the Antwerp dialect, and the 
increasing use of uvular [R] in Antwerp in recent years (De Schutter 1999), will also 
add to the increased use of uvular [R] north of Antwerp. 

The occurrence of uvular [R] along the Germanic-Romance language border 
may have been brought about by the impact of varieties of French having uvular [R]. 
Kruijsen (1995) established stronger tendencies of borrowing from French 
along the language border in the province of Limburg. This effect was stronger 
the closer a place was to the language border. He admittedly investigated lexical 
borrowing, whereas the /r/ pronunciation is in fact a structural borrowing. 
Kruijsen (1995) found, however, some more general patterns of borrowing that 
do not exclude the borrowing of uvular [R]. The mechanism of borrowing can 
help to explain why places shift in reporting a uvular [R] when the three 
data sources are compared. The uvular [R] may arise because of bilingual speakers and language 
contact, but may disappear through counter-effects as well. An intensive language 
contact situation, active bilingualism, in combination with the dominance of 
the French language offers a plausible platform for the occurrence of uvular [R] 
in Brussels and its surroundings. Patterns of diffusion may have played and still 
may play a role in propagating the uvular [R] in the municipalities of Brussels. 
The Ghent uvular [R] is a famous case in the Dutch dialectological literature, 
as is the earlier mentioned city of The Hague in the Netherlands. The origin 
and prestige of the uvular [R] is found in French, the language learned and 
used by the local bourgeoisie who started to use this form in their dialect and 
standard Dutch. The RAS data testify how strongly the uvular [R] is nowadays 
expanding around Ghent (see also Rogier 1994). The uvular [R] apparently 
acquired prestige in Ghent and its surroundings. 

The situation in the Flemish Brabant places is much less transparent. Perhaps 
Brussels had an impact, perhaps it was the same mechanism, only later, as in 
Ghent and The Hague, perhaps a combination of both. It may have its origin in 
an urban hierarchy in which Brussels has prestige. The hierarchy matches 
the direction and patterns of diffusion in Flemish Brabant. 

The literature often reports how rapidly the spreading of the uvular [R] must have 
taken place in France and Germany. The RAS data confirm in the age patterns 
found (see Figure 2) that the change can be fairly rapid and may have more 
the contours of a sharp shift than of a gradual curve. This means that the 
geographical and social embedding can develop or construct pathways along 
which language change can proceed with a strong and intensive impetus. In 
such rapid and vigorous changes it is likely that children play an important role, 
just as in the spreading of approximant /r/ in the Netherlands (Van Bezooijen 
2005). The /r/ sound being one of the latest to be acquired, it seems to be 
sensitive to phonetic adaptation in childhood (Van Bezooijen 2005:29) and 
children might take over uvular [R] from other children, not from their parents. 
Also the fact that (some of the) uvular variants are easily distinguishable from 
alveolar variants might be a factor in the speed of this change. When uvular [R] 
develops overt positive prestige, it may rise radically in speech communities at 
the expense of the alveolar variants. This explains at the same time the rigor with which 
uvular [R] was contested, as mentioned often in the literature. 

The ease of perception relates to the question of the linguistic embedding of 
the alveolar and uvular variants. As to the linguistic embedding of the uvular 
[R] our data give some relevant outcomes. The number of mixed speakers is 
remarkably low. There is no mention of mixing by individual speakers in the 
RND data, but according to the remarks in the atlas one of the three informants 
in Bree (Limburg) is an alveolar [r] speaker, uvular [R] being the ‘dominant’ 
variant, and in Strombeek-Bever (near Brussels) one informant, a primary 
school teacher, speaks with [r], while [R] is ‘common’. The GTRP data contain 
two speakers with mixed realization. The most obvious explanation, given the 
strong regional concentration and the remarks in the notes of RND, seems to be 
the impact of the alveolar standard norm. The RAS data contained 102 mixed 
speakers (5.3%). They occurred in all kinds of localities and we could not trace a 
further explanatory pattern in their occurrence. We would have expected more 
mixed speakers in mixed localities, but we did not find such a distribution. On 
the other hand, the variability within places was large, which sometimes was the 
consequence of the differences between the two age groups investigated. 

In the embedding of the change from alveolar to uvular we found no
co-variation patterns with other variants having a strengthening or mediating
effect in the transition from alveolar to uvular variants, for instance by using
a trilled uvular [R] first, followed by the occurrence of fricative variants, or by
using uvular variants first in onset or coda. No traces of specific tracks or routes
emerged, which may explain the scarcity of mixed speakers. It would
nevertheless be worthwhile to perform a more detailed study of mixed speakers. At
any rate, the high variability between speakers and the ease of perception of the
alveolar-uvular distinction make the /r/ a perfect candidate to get involved in
social patterning. However, it should be noted that not all uvulars are easy to 
distinguish from alveolars, and that this might also play a role in this change in 
progress. 

The RAS data showed that there are hardly any homogeneous communities with 
respect to /r/. That means that /r/ is inherently marked by variation, that uvular 
[R] is almost everywhere and that these variations may start to coincide to reach
the stage of an incipient change. External forces or factors (borrowing from
French, neighboring dialects, migration from the Netherlands) may contribute
to strengthen patterns of co-incidence, resulting in a rapid and massive spread
of uvular [R] over Flanders. Some of our explanations are very tentative, and to
fully understand the rise of uvular [R] more research is needed to understand the 
role of and interaction between the factors suggested in this paper. 

Montreal French: An exploration of stylistic
conditioning in a sound change in progress

Abstract

This chapter focusses on the middle phase of a very rapid change, exploring the relation 
between the phonological conditioning and the stylistic conditioning of the variation 
across the lifespan with regard to the situation of the speaker in the change spectrum. An 
analysis of the real-time change from apical [r] to posterior [R] in Montreal French for
two speakers across the lifespan illustrates that the sensitivity to stylistic conditioning is
a complex phenomenon. Although both speakers acquired the apical variant as children,
they are not equally sensitive to the stylistic environment. Further research using a
combination of trend and panel study needs to be done on other variables involved
in the process of change if we want to better understand the relation between stylistic
markedness and the process of change.


1. Introduction¹


Previous studies of sound change have indicated that change tends to proceed 
incrementally. The many ongoing sound changes in Philadelphia vowels, for
example, show a regular progression across generations in the elegant regressions
of Labov (2001). Regular, incremental progression also appears to be the order
of the day in the massive vowel rotation of the Northern Cities Shift (Labov
et al. 1972; Labov 1994), in the retrograde shift of the Parisian vowels (Lennig
1978), in the raising of (o) in Korean (Chae 1995) and many other cases.
With respect to consonants, incremental change seems less obvious. More
discrete in nature, consonantal change might be susceptible to more dramatic
or rapid change. Here again, available studies point to quantitative alteration
such that the innovative form becomes increasingly dominant over time (e.g.
Cedergren 1973b, 1988; Labov 1994; Haeri 1994).²

This established finding, however, does not imply that sound change must operate
incrementally. Our research on the replacement of Montreal French apical [r]
by posterior [R] in the 1960s-1990s has indicated a drastically different pattern
for the implementation of this change (Sankoff et al. 2001; Blondeau et al.
2003; Sankoff & Blondeau 2007). In this change from above, many individual
speakers have passed from a highly variable use of both [r] and [R], to a stage in
which they are categorical or near-categorical users of [R], without having used
any phonetically intermediate variants.

In the current paper, we examine the linguistic behavior of two speakers across 
the lifespan in order to illuminate the role of stylistic variation in different 
phases of the change. This detailed analysis allows us to explore the relation 
between the phonological and stylistic conditioning with regard to the situation 
of the speaker in the change spectrum. 

After providing a summary of our previous research on the [r] → [R] change
in Montreal, and explaining our methodology, the article concentrates on the 
individual variability, more specifically on the stylistic conditioning of the 
variation. 


2. Our previous research on the [r] → [R] change in Montreal


In studying the real-time change from apical [r] to posterior [R] in Montreal
French, we have employed both trend and panel comparisons. This was made
possible through the use of three corpora, recorded in 1971, 1984 and 1995
(Sankoff & Sankoff 1973; Thibault et al. 1990; Vincent et al. 1995). Our data
on Montreal include 120 speakers recorded in 1971, and 60 of the same people
recorded again in 1984. In addition, 12 younger speakers were added in 1984.
Of the original speakers, 12 were recorded again in 1995, along with 2 from the
younger 1984 cohort.


² One striking exception to the gradual character of changing relative frequencies in consonantal change is
documented in Trudgill's re-study of Norwich, in the merger of /f/ and /θ/, and non-initial /v/ and /ð/. He
found that "not a single speaker in the 1968 sample showed even one instance of this phenomenon, [but] of
people born between 1959 and 1973, 41% have the merger variably; and 20% have a total merger, i.e. /θ/
has been totally lost from their consonantal inventories" (1988:43). Many variable consonantal alternations
are, of course, not involved in change, e.g. the alternation in English of (th) and (dh) with affricates and
stops in Philadelphia (Labov 2001, Chapter 3); and Spanish s > h > Ø in Panama (Cedergren 1973a).

Our first paper on (r) (Sankoff et al. 2001) was based entirely on panel 
comparisons of individuals selected from the three corpora. Making maximal 
use of the reduced 1995 corpus, we studied the 14 speakers carried through 
1995, along with a further 11 for whom comparisons were possible between 
1971 and 1984 only. We were surprised to discover that a sizeable minority of 
speakers had altered their usage significantly over the years, and decided that 
an expanded group of subjects was necessary in order to understand the course 
of the change more fully, as it was implemented by individual speakers. In a 
second study, we examined the trajectories of several individuals, comparing 
their implementation of the [r] → [R] change with their adoption of an ongoing
morphological change from above (Blondeau et al. 2003). In a third study, an
enlarged sample was designed to make trend vs. panel comparisons over the
1971-1984 period (Sankoff & Blondeau 2007). That paper clearly shows that the
change was implemented chiefly by a younger cohort of speakers joining
the pool of [R] users, and that change over the lifespan by individual speakers is
part of the general movement, but not its driving force.


3. Methodology 


As in our previous research, this paper reports on the two major variants of
interest in the ongoing change³:


a) The apical variant, [r], whether flapped or trilled; and
b) The posterior [R], which included both trills and fricatives, the latter often
very weakly articulated.


For each speech sample, we followed Clermont & Cedergren (1979) in
calculating the percentage of [R] as a function of the two consonantal tokens,
according to the formula [R] / ([R] + [r]) * 100. We then carried out χ² analysis
to verify whether codings were significantly different, taking the .05 level as
our baseline. When two codings were more dissimilar than this, we had a third
person re-code, then (in most cases) held a group session in which we reconciled
the codings. For a handful of very difficult samples (in some instances because
of poor sound quality), we reconciled the codings ourselves in the course of the
analysis necessary for this paper.


³ In addition to these two, we coded for four other variants: cases which were too indistinct to hear were
coded as indistinct, and removed from further consideration; deleted (r) in final clusters were coded as
deleted; a fifth variant was the rather rare retroflex known locally as the ‘American r’, and articulated as in
English Canadian pronunciation; and a final variant was vocalized (r). This variant, most often found in the
coda environment, though not restricted to it, is very frequent in the speech of many Montrealers,
especially in function words like sur and pour.
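
The percentage formula and the comparison of two codings described in this section can be sketched in a few lines of Python. This is only a minimal sketch, not the authors' actual procedure: the function names, the token counts attributed to the two coders, and the choice of an uncorrected Pearson χ² test on a 2 × 2 table are all assumptions made for illustration.

```python
def percent_R(n_R, n_r):
    """Percentage of posterior [R] among consonantal tokens: [R] / ([R] + [r]) * 100."""
    total = n_R + n_r
    if total == 0:
        raise ValueError("sample contains no consonantal (r) tokens")
    return 100.0 * n_R / total


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of chi-square with 1 degree of freedom at the .05 level.
CRITICAL_05_DF1 = 3.841


def codings_differ(coder1, coder2):
    """True if two (n_R, n_r) codings of one sample differ at the .05 level."""
    (a, b), (c, d) = coder1, coder2
    return chi_square_2x2(a, b, c, d) > CRITICAL_05_DF1


# Hypothetical tallies: each coder's ([R], [r]) counts for the same speech sample.
coder_1 = (60, 40)
coder_2 = (45, 55)
print(percent_R(*coder_1), percent_R(*coder_2))
print(codings_differ(coder_1, coder_2))  # True would call for a third coder
```

Under these assumptions, a pair of codings that exceeds the critical value would be passed to a third coder, while a pair below it would be accepted as equivalent.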

The next step was to code for the independent variables we predicted might
condition the alternations for the variable speakers. In the present paper, we 
report our findings on stylistic conditioning for two speakers recorded in all 
three periods between 1971 and 1995. 


4. Individual [r] ~ [R] variability 


A first question to be asked is how typical intra-individual variation is. To
provide a general assessment of this question, we examined all the speech
samples (124) we have coded for (r) variability across all time periods.


BASIC SAMPLE                     1971 SPEAKER   1984 SPEAKER   1995 SPEAKER   TOTAL SPEECH
COMPOSITION                      SAMPLES        SAMPLES        SAMPLES        SAMPLES

ORIGINAL 1971 SPEAKERS                64             34             12            110
YOUNGER SPEAKERS ADDED IN 1984                       12              2             14
TOTAL                                 64             46             14            124


Table 1 — All speech samples that form the pool for studying the conditioning of (r) variability. 


Since the general findings on change in progress led us to expect incremental 
change throughout the community, we were surprised to discover that the 
majority of speakers tend toward categorical use of one of the two variants. 
Eighty-three of the 124 speech samples (that is, 67%) exhibit categorical or near- 
categorical behavior on the part of the speakers (if near-categorical is defined as 
within 10 percentage points of 0% or 100%). Clermont & Cedergren's findings
on the entire 1971 sample had also revealed most of the speakers to be close to
0% or 100%, but we would have assumed that a real-time comparison would
show more intermediate speakers, if the change progressed incrementally.
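
The threshold used in this paragraph can be made explicit with a small helper. This is a minimal sketch: the function name and category labels are hypothetical, and treating the 10-point boundary as inclusive is an assumption that the text does not settle.

```python
def usage_category(pct_R, margin=10.0):
    """Classify a speech sample by its percentage of [R]."""
    if not 0.0 <= pct_R <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    if pct_R <= margin:
        return "near-categorical [r]"
    if pct_R >= 100.0 - margin:
        return "near-categorical [R]"
    return "variable"


# Illustrative percentages of [R] for three speech samples.
for pct in (7, 61, 93):
    print(pct, usage_category(pct))

# Share of samples reported as categorical or near-categorical: 83 of 124.
print(round(100 * 83 / 124))
```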

Most of the near-categorical speakers of 1971 stayed that way in 1984, but a 
majority of the variable speakers moved towards categoriality. In Sankoff & 
Blondeau (2007), we divided our 32-speaker panel into ‘low’, ‘intermediate’ 
and 'high' users of the innovative [R] variant in 1971. Only 2 of the 12 
‘low’ range users of [R] in 1971 had moved into the ‘intermediate’ range by
1984. On the other hand, most of the ‘intermediate’ speakers of 1971 had 
moved into the ‘high’ range by 1984. That category increased from 12 to 18 
speakers by 1984, with more than half of the panelists now having become 
categorical or near-categorical users of innovative [R]. From the point of 
view of individuals, then, it seems that being in the intermediate range of 
[r] ~ [R] variability is a very unstable state, with most intermediate range 
speakers moving to categoriality over their lifetimes. 

Of the two speakers selected for the study of stylistic conditioning in the current 
paper, one (André L.) was in the intermediate range over all three time periods, 
whereas the other (Lysiane B.) was a virtually categorical user of the apical 
variant in 1971, and showed considerable change later in life. 


4.1 Stylistic conditioning of [r] ~ [R] variability 

The question addressed in this paper is whether speakers who have adopted 
the innovative [R] in variation with the traditional [r] also show sensitivity to 
stylistic considerations. Innovative [R] is a change from above, higher values
of [R] being associated with women and with higher linguistic market indices
(Sankoff et al. 2001). Thus, it is reasonable to investigate whether speakers
associate [R] with formal style, or youth, or women, or higher social class, and
on the other hand, whether they associate [r] with being old or old-fashioned,
or with intimacy or informality. We have modeled the change as one in which
many speakers would have acquired [r] in primary acquisition in the family
setting, adopting [R] later in childhood or adolescence under the influence of
peers (Sankoff & Blondeau 2007). Thus stylistically, it is possible that speakers
who have made such a change over their own lifetimes will associate the [r] 
variant with family and their own childhood. 

Of all the middle-range speakers, we chose two of those who were followed 
across the 24-year time span of the study for stylistic analysis. Both in their 
twenties in 1971, they belonged to the first generation of speakers who 
were at that time adopting innovative [R] as their basic consonantal variant. 
This was, however, more typical of middle and upper-middle class speakers
(Sankoff et al. 2001), and the two we follow here were from working-class 
backgrounds. 

Lysiane B. (#7) at age 24 in 1971 was newly married, a factory worker who had
not finished high school, but she and her husband were already planning a home 
in the suburbs and a better life for their family. As described in Blondeau et al. 
(2002), Lysiane by 1984 had forged a career in sales, and she, her husband and 
young daughter were indeed living in their suburban home. By 1995, Lysiane 
had become a realtor, and projected self-confidence in her own mastery of her
course in life, as well as pride in her daughter's accomplishments. 

André L. (#65) was 27 in 1971, single, and working in his chosen profession 
as an actor. He talks of his working-class father's aspirations for his children to 
achieve white-collar status with some job security, but explains how he himself 
(having finished high school, and recently graduated from a prestigious acting 
school) prefers living on a limited income with a meaningful profession. At 40, 
married with a toddler and a new baby, he was still following this financially 
unrewarding career path in 1984. By 1995, however, he had had to give up 
on acting and find a more certain source of income, and had shifted, as he 
explained in his interview, to gerontology, working as an animateur in a facility 
for senior citizens. Even with both himself and his wife working full time, he 
talks of financial worries supporting a family that now includes a teenager who 
needs music lessons. Despite these problems, André is clearly someone who 
finds much satisfaction in both his work and family life. 

What kind of diachronic trajectories do these two speakers have? For Lysiane,
her dramatic upward social mobility seems to go hand in hand with a dramatic
rise in her use of the innovative [R], from only 7% in 1971 to 65% in 1984, after
which she continues to increase more slowly, registering a value of
75% [R] in 1995 (a statistically significant increase between 1984 and 1995).
André, in contrast, was already a middle-range user of [R] in 1971. Though the
overall values of [R] reported for him increase slightly, from 61% [R] in 1971,
to 66% in 1984, to 69% in 1995, these slight increases were not statistically
significant, leading us to conclude that André has been a stable mid-range user
of [R] over the 24-year period of the study (a pattern atypical of our sample as
a whole).

To study stylistic variation, we increased the sample size for both these speakers, 
and searched as well for portions of their interviews that might be likely to 
show the most different behavior. Both speakers showed stylistic variation, but 
in different ways. Since Lysiane had close to categorical use of [r] in 1971, with
only 7% [R], our stylistic analysis deals with her in 1984 and 1995, and André
in 1971, 1984 and 1995.

The results for Lysiane are reported in Table 2. We first studied three segments
in her 1984 interview. We expected that a segment in which she recounts a 
conflict with the administration of her daughter's school might yield a higher 
rate of [R] than she uses in discussing more mundane topics, and this did prove 
to be the case. However, we also expected that she might show a significantly 
lessened use of [R] in the most emotional segment of the recording, one in 
which she narrates her daughter's harrowing experience with a near-fatal illness. 
If Lysiane's use of [r] still represents her ‘vernacular’ in the sense of its being her
dominant form throughout childhood and up through at least the age of 24, we 
reasoned that this very emotional story might lead her toward more vernacular 
usage. However, [R]-usage in this segment was not significantly different from 
its use in Lysiane's recounting of more mundane family history as shown in the 
first part of Table 2. Only in segment C is [R] use significantly different from — 
in this case more frequent than — the other two segments (whether considered 
separately or combined). 


                                        NO. OF [R]   NO. OF [r]   % [R] /      ALL
                                        TOKENS       TOKENS       ([R]+[r])    TOKENS

A. MUNDANE FAMILY HISTORY                   22           21          51%         63
B. DAUGHTER'S NEAR-FATAL ILLNESS            87           45          66%        199
C. CONFLICT WITH SCHOOL AUTHORITIES         41           14          75%         81
TOTAL 1984                                 150           80          65%        343

D. CONFLICTS WITH HER MOTHER                37           18          67%         93
E. GRANDMOTHER'S DEATH                      45           13          78%         97
F. BUSINESS DECISIONS                       68           20          77%        147
TOTAL 1995                                 150           51          75%        337


Table 2 — [R] and [r] use by topic for Lysiane B. in 1984 and 1995. Tokens of [r] and [R] add 
to less than the total coded since non-consonantal variants included there did not enter into 
the percentage calculations. 
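
The percentage column of Table 2 can be recomputed from the token counts, which also makes the caption's point about non-consonantal variants concrete. This is a minimal sketch: the counts are those of Table 2, read with the first count column as [R] and the second as [r]; the dictionary layout and the function names are illustrative assumptions.

```python
# (n_R, n_r, all_tokens) per segment, taken from Table 2.
TABLE_2 = {
    "A": (22, 21, 63),
    "B": (87, 45, 199),
    "C": (41, 14, 81),
    "D": (37, 18, 93),
    "E": (45, 13, 97),
    "F": (68, 20, 147),
}


def percent_R(n_R, n_r):
    """[R] / ([R] + [r]) * 100, rounded to the nearest whole percent."""
    return round(100 * n_R / (n_R + n_r))


def summarize(segments):
    """Pool segments; return (percent [R], number of non-consonantal tokens)."""
    n_R = sum(TABLE_2[s][0] for s in segments)
    n_r = sum(TABLE_2[s][1] for s in segments)
    coded = sum(TABLE_2[s][2] for s in segments)
    return percent_R(n_R, n_r), coded - (n_R + n_r)


for label, (n_R, n_r, coded) in TABLE_2.items():
    print(label, percent_R(n_R, n_r), coded - (n_R + n_r))
print("1984", summarize("ABC"))
print("1995", summarize("DEF"))
```

The second value printed for each row is the number of coded tokens that fall outside the [R] ~ [r] alternation and therefore play no part in the percentage.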


How can we explain why Lysiane’s behavior did not match our expectations in this
regard? It may be that we misanalyzed the stylistic nature of segment B: for example,
some of it concerns Lysiane’s dealing with doctors and hospital authorities, figures
who may be parallel to the school authorities in segment C. However, separating
this long segment into, for example, the utterances revealing Lysiane’s emotional
responses and those involving reported conversations with authorities did not reveal
any particular patterning in her use of the two variants. For example, in (1), her use
of [r] and [R] shows a preference for using [R] in codas and [r] in onsets⁴, but does
not obey any stylistic constraints we could identify. (Other coda r's in clusters in this
example were deleted and thus did not enter into the alternation at issue here.)


⁴ This is a general tendency we have identified for almost all of the ‘mid-range’ variable speakers we have
analyzed, as discussed in Sankoff & Blondeau (in preparation).


(1) Ça peut être n'impo[R]te qui                    It could be anybody who
    même les membres de la famille qui              even family members who
    peut être-- être po[R]teurs du mic[r]obe        could be, be carriers of the germ.
    Puis étant donné qu'elle,                       And given that she,
    elle était en faiblesse là avec ses otites      she was weakened by her ear infections,
    elle l'a att[r]apé.                             she caught it.


A more likely interpretation of these results is that apical [r] is no longer
Lysiane's unmarked, vernacular pronunciation of (r). In her case, it seems that
posterior [R] may yet carry the general implication of a pronunciation associated
with authority, education, and formality. The one subsection of her encounter
with the school administration in which [R] co-occurs with a hyper-formal⁵
(and hypercorrect) form is in (2). When Lysiane confronts her daughter’s
teacher about the lunch policy, asking her who exactly set the policy, the
teacher's answer is reported as containing a liaison with infinitival [R], in a
sentence where it would probably have been the past participle which was used.
Lysiane continues to report herself as having replied with another infinitival [R].
It would seem almost impossible to have scored this rhetorical coup using an
apical [r] in the liaison, yet her emphasis here is on the fact of the liaison itself
and not the particular variant of (r) used.


(2) «Suzanne et moi en avons décidé[R] ainsi».      "Suzanne and I have decided that way"
    J'ai dit                                        I said,
    «Moi je vais en décide[R] autrement»            "I'm going to decide otherwise"
    pour répondre: sur le même «air».               to reply with the same "air".


Overall, however, what we see with Lysiane is that the formal passage contains
75% [R] use, without any individual subsections being particularly marked
with [R] (perhaps difficult to do when what would be so marked would be the
statistically unmarked form). Nor did the words in which [r] occurred here,
or in her other passages, appear to be stylistically marked in any way. Lysiane
raises her overall level of [R] use in dealing with a topic marked by formality,
yet individual tokens are not associated with a particular stylistic force. This
resembles the situation for the negation in French where ne is associated with
formality without being used all the time in formal contexts (Sankoff & Vincent
1977).


⁵ Other hyper-formal elements in this short sentence include the object clitic en and the use of the first
person plural verbal suffix -ons (where normally one would find Suzanne et moi, on a décidé).

In 1995, we again studied three segments from Lysiane as illustrated above 
in Table 2. In none of these segments does [R] use differ significantly from
the others. Her discussion of business decisions and difficulties with opening a 
dress shop (segment F) shows [R] use on a parallel with her 1984 segment on 
conflict with the authorities at her daughter's school. However, segments D and 
E, chosen to tap into Lysiane's most unself-conscious speech, showed [R] usage 
that is not significantly different from segment F. In D, she recounts how her 
mother was not happy living with Lysiane's family after being widowed, and in 
E (a passage which begins so emotionally that the tape recorder was turned off 
for a few minutes), she tells of her grandmother's death. Both of these passages 
seem to confirm that [R] is now part of Lysiane’s vernacular. This time, there are 
a few tokens of apical [r] in words that carry an ironic flavor, especially in the 
segment about conflict with her mother, but overall, stylistic variation seems not 
to be characteristic of Lysiane's use of [R] ~ [r] in 1995. 

André is a different story. Though stable across time, André's use of (r) 
variation seems more closely keyed to the use of individual tokens. Classified 
as a Middle Class speaker due to his high position on the linguistic market 
index counterbalancing for his working-class family background, André was an 
interesting case to study. He was born in 1944 and has several older siblings, and
we assume from his family background that André also acquired [r] in his primary
language acquisition. However, he is unusual in having undergone training as an actor
that included specific attention on the part of teachers and coaches from France 
whose mission it was to teach the Québécois actors to lose their local accents 
and speak 'international' French. In both 1971 and 1984, André speaks at 
length about his profession and in these segments, [r] is almost entirely absent, 
as shown in Table 3. Segment C differs significantly from A and B in 1971; 
Segment F in 1984 is virtually the same as the corresponding stylistic segment 
in 1971, and differs significantly from Segments D and E. In 1995, André was 
no longer working as an actor and did not talk about the theatre: Segment I, the 
most formal topic he discussed, differs significantly from G and H. 


                                               NO. OF [R]   NO. OF [r]   % [R] /      ALL
                                               TOKENS       TOKENS       ([R]+[r])    TOKENS

A. MUNDANE FAMILY HISTORY ('71)                    48           53          48%        137
B. FINDING A POSITION IN THE WORKFORCE ('71)       63           73          46%        185
C. THE EXPERIENCE OF THEATRE SCHOOL ('71)          99           10          91%        134
TOTAL 1971                                        210          136          61%        456

D. THE PLEASURES OF LIFE IN THE COUNTRY ('84)      40           30          57%        100
E. FAMILY GAMES AND ENTERTAINMENT                  36           38          49%        118
F. LIFE AND WORK IN THE THEATRE ('84)              66            5          93%         95
TOTAL 1984                                        142           73          66%        313

G. FAMILY, FINANCIAL WORRIES ('95)                 60           32          65%        124
H. PARTIES, DINNERS, DRINKING ('95)                30           23          57%         90
I. POLICY, POLITICS, WORK ('95)                    71           18          80%        124
TOTAL 1995                                        161           73          69%        338


Table 3 — [R] and [r] use by topic for André L. in 1971, 1984 and 1995. 
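
One way to compare the two speakers is to compute, for each interview, the spread between the segments with the highest and lowest percentage of [R]. This is a minimal sketch: taking 'stylistic range' as maximum minus minimum is a summary measure assumed here for illustration, not a statistic reported by the authors, and the ([R], [r]) pairs are the segment counts of Tables 2 and 3.

```python
# ([R], [r]) token counts per topic segment, from Tables 2 and 3.
COUNTS = {
    ("André", 1971): [(48, 53), (63, 73), (99, 10)],
    ("André", 1984): [(40, 30), (36, 38), (66, 5)],
    ("André", 1995): [(60, 32), (30, 23), (71, 18)],
    ("Lysiane", 1984): [(22, 21), (87, 45), (41, 14)],
    ("Lysiane", 1995): [(37, 18), (45, 13), (68, 20)],
}


def stylistic_range(segments):
    """Difference, in percentage points, between the highest and lowest % [R]."""
    percents = [100.0 * n_R / (n_R + n_r) for n_R, n_r in segments]
    return max(percents) - min(percents)


for (speaker, year), segments in COUNTS.items():
    print(f"{speaker} {year}: {stylistic_range(segments):.1f} points")
```

On these counts the spread is roughly 44 points for André in both 1971 and 1984 and about 23 points in 1995, against roughly 23 points for Lysiane in 1984 and 10 in 1995.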


It is clear that André's stylistic range is greater than that of Lysiane. In the 
sections devoted to discussion of the theatre, alveolar [r] is almost completely 
absent. In these sections, he appears to use individual tokens of [r] for stylistic 
effect, reminiscent of Gumperz’ (1982) analysis of metaphorical code-switching. 
In one 3 1/2 minute segment from section C, there are only 3 alveolar tokens in 
an otherwise uninterrupted sequence of 80 posterior tokens. Two of the three 
occur in (3), where André switches from the exaggerated ‘French French’ accent 
he adopts for the words l'accent français in the second line, to the common
Québécois expression s'énerver ben gros. Both in énerver and in gros, André uses
an apical [r], co-occurring with the usual pronunciation of bien without the
glide whenever it is used in (this nonstandard) adverbial function.


(3) Tu comp[R]ends, le petit foula[R]d de côté      Y'know the little scarf on the side
    puis le: puis l'accent f[R]ançais               and the «French» accent
    et puis tout le t[R]alala tu-sais.              and all that brouhaha.
    Moi ça m'énervait ben gros tu sais.             It bugged the hell out of me.
    Bien 'a fallu que j'app[R]enne à pa[R]ler       I had to learn to speak
    le f[R]ançais inte[R]national [. . .]           International French [. . .]
    pa[R]ce-que quand tu joues du Moliè[R]e         because when you play Molière
    avec l'accent québécois là,                     with a québécois accent,
    tu es bloqué pas-mal.                           you're pretty much blocked.


Later on he uses another expression clearly part of the Québécois vernacular 
when once more he evaluates himself, this time from the standpoint of some 
of his ambitious theatre school classmates, in the midst of a segment that is 
otherwise entirely characterized by posterior [R].


(4) Alo[R]s à ce moment là                          So at that point
    ils me disaient que j'étais fou,                they told me I was crazy,
    que j'étais pas ben ben brillant.               that I was not too too bright.


Here the phrase ben ben brillant uses the colloquial evaluative adverbial ‘ben
ben’, never pronounced with a glide, along with the unique use of apical [r] in
brillant. 

André’s use of the traditional Montreal [r] continues, albeit in a less unequivocal 
form, to be stylistically marked in other discourse that includes a much lower 
rate of [R] use. He tends to use a higher rate of alveolar [r] in contexts referring 
to the family and to childhood, whether his own or that of his own children. For 
example, in 1984, after a set of rather impersonal reflections on why he prefers 
country living to the city, featuring mainly posterior [R], he suddenly mentions 
the concrete experience of cross-country skiing with his toddler, saying that he 
loves to go out with his son on days when he doesn't have to work (all words in 
bold characters feature alveolar [r]): 


(5) Mes skis sont toujours en avant                 My skis are always in front (of the house)
    puis je pars.                                   and so I leave.
    Puis je vais faire du ski de fond.              And so I go cross-country skiing.
    Puis là avec le petit,                          I have the little guy,
    la traîne-sauvage en arrière                    with the toboggan behind,
    puis je le traîne dans le bois                  and I pull him through the woods
    puis je l'occupe tout un après-midi.            and I keep him busy all afternoon.


Like Lysiane, in discussions of family history André intersperses apical and 
posterior variants throughout. But whereas for her, even a mini-concentration 
of three or four apical r's in a row does not in itself seem to carry any emotional 
association, in André's speech apical (r) often appears to cluster in utterances 
(though not necessarily particular words) that are especially imbued with 
emotion. These are usually positive but sometimes have a wryly ironic flavor as 
in (3) and (4) above. 

By 1995, André has left the theatre and nowhere does there occur a context 
in his interview in which he uses [R] as exclusively as in 1971 and 1984. His
discussion of his work and of politics in segment I produces only 80% [R], a
significant decline from the formal contexts of 1971 and 1984. However, we feel 
certain that were André once more to talk about his acting career, we would see 
the same more extreme stylistic range he demonstrated earlier in his life. 

Our interpretation of these results from André is that, as a trained actor who 
has been made sharply aware of dialect differences, he probably represents 
the upper limit of speakers’ ability to deploy the two (r) variants stylistically. 
This stylistic differentiation for André may be part of the explanation for
why he remains a variable speaker and does not show evidence of an overall
increase in [R] between the age of 27 and 51. That stylistic variation rather than
change over time is important for André can be seen from Figure 1, which 
plots all of André's segments by topic and year. Between age 27 and 51, André 
maintains a fairly consistent overall level of [R] in the range of 65% - 70%, but 
he also maintains clear stylistic differences as can be seen in Table 3 above. 
His phonological conditioning also remains stable, with codas yielding slightly 
higher percentages than onsets throughout — a much lesser difference than he 
exhibits in stylistic range. 

[Figure 1: line chart of the percentage of [R] (vertical axis, 0% to 100%) by year (1971, 1984, 1995), with one line per topic: Family history, Personal history, Theatre/Work.]

Figure 1 — Percentage of [R] for André L. by topic and year. 


Comparing Lysiane with André, both speakers from a working-class 
background, we made the assumption that both acquired [r] as children. This 
was based on the fact that although some middle- and upper-class speakers 
in their 20s in 1971 tended to use [R] as their vernacular form, most
working-class speakers were predominant users of [r] (Sankoff et al. 2001). When we
met Lysiane in 1971, this was still her pattern at age 24. André at 27, with his
theatre-school experience behind him, already showed a great stylistic range 
and a vernacular pattern in which the two forms were in variation. Over the 
next 24 years, Lysiane's upward social mobility was accompanied by a dramatic 
increase in her use of [R], but she shows only slight stylistic conditioning in 
1984, and none in 1995 when [R] seems to have replaced [r] in her vernacular. 
André on the other hand has not experienced upward social mobility and has 
not changed over time, but continues to show stylistic conditioning. 


5. Conclusions 


To date, there have been relatively few panel studies in which data on individuals 
has been reported over the span of a decade or more. Looking at vowel systems, 
Brink & Lund (1979) and Labov & Auger (1998) have shown stability in
individual speakers, similar to the roughly 2/3 of our speakers who were stable
across time. The majority of vowels of the speaker studied by Prince (1987,
1988) were also stable over 4 decades. In the domain of morphology, research
on Montreal French auxiliary selection has shown stability in all but one or two 
of 60 speakers between 1971 and 1994, in the face of community change toward 
the use of être (Sankoff et al. 2004). Further work on the alternation between 
periphrastic and inflected future has found stability for the majority of the 
same 60 panel speakers, with upper-class speakers showing retrograde change, 
increasing their use of the inflected future across their adult lives (Wagner & 
Sankoff 2011). For the alternation between a gente and the first person plural in 
Portuguese, Zilles (2005) reports that 11 of 13 speakers in a panel study across 
roughly two decades were stable in their use of a gente; the other two speakers 
showed retrograde change over their lifespans. Ashby (2001) reports that of 10 
French speakers followed across a 19-year period, 6 were stable in their use of 
ne-deletion. Of the remaining four, three reduced their use of ne (the direction 
of community change) and one was anomalous in her increased use. A study of 
noun phrase agreement 
in Portuguese has also shown that across two decades, a sizeable minority of 
speakers (5 of 16) substantially increased their use of agreement — the direction 
of community change (Naro & Scherre 2002). 

Taken together, these panel studies demonstrate that although speaker stability 
in adult life seems to be the majority pattern, we frequently find a sizeable 
minority of speakers dramatically increasing their use of the innovative variant, 
with small minorities becoming more conservative as they age. 

Several of these studies have, like our study of Montreal (r), included a larger trend 
component along with a study of a subset of speakers as a panel. The studies of 
Ashby (2001), Naro & Scherre (2002), and Zilles (2005) concur with our research 
in two important respects: (1) community change outpaces that of individual 
speakers across their lifespans; and (2) in all these cases where options are binary, 
with no intermediate forms involved, change for individual speakers is often quite 
dramatic. It is possible that the fact that options are binary and discrete, in the 
[r]/[R] case as in the morphological alternations, makes possible the abrupt and 
rapid character of the change, as opposed to the slow and incremental nature of 
many of the vocalic changes described in previous research. 

The implications of these results, in terms of whether individual grammar change 
in adult life is a matter only of quantitative change or whether qualitative 
change is involved, are the topic of Sankoff (in preparation). What we can reliably 
say on this point at present is that most of those speakers who changed from 
intermediate range use of [R] to categorical or virtually categorical use also went 
from a grammar where onsets and codas differentially conditioned (r) variation 
to a grammar that lacked this conditioning. 

Our analysis in this paper has concentrated on the middle phase of a very rapid 
change, investigating the stylistic conditioning of the variation. Sensitivity to 
stylistic conditioning has proved complex, as illustrated by the detailed 
analysis of the alternation for two speakers across the lifespan. These two 
speakers, both of whom acquired the apical variant as children, are not equally 
sensitive to the stylistic environment. Our analysis has shown that one of the two speakers 
already manipulated the alternation of the variants for stylistic purposes at the 
age of 25 in 1971 due to his personal background as an actor, and maintained 
this ability in later life. However, the other speaker, who was still using her 
vernacular [r] pattern at the age of 24 in 1971, changed dramatically toward 
[R], probably due at least in part to her upward social mobility, without having 
shown a clear stylistic manipulation of the variants. In her case, it seems that 
one variant has replaced the other as the default variant. Further research using 
a combination of trend and panel study needs to be done on other variables 
involved in the process of change if we want to better understand the relation 
between stylistic markedness and the process of change. 